OM nucleic - nucleic search, using sw model



Run on: August 15, 2004, 04:17:29 ; Search time 3155.99 Seconds 

(without alignments) 
9405.294 Million cell updates/sec 

Title: US-09-864-675-1 
Perfect score: 994 

Sequence: 1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994 

Scoring table: IDENTITY_NUC

Gapop 10.0 , Gapext 1.0 

Searched: 27513289 seqs, 14931090276 residues 

Total number of hits satisfying chosen parameters: 55026578 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
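
The throughput figure in the header can be cross-checked against the other header values. The sketch below is a hedged reconstruction, not something the report states: it assumes that one cell update is one query residue compared with one database residue, and that both strands of every database sequence were searched (`STRANDS = 2` is that assumption). The second assumption would also explain why the reported hit count is exactly twice the number of sequences searched.

```python
# Values copied from the report header.
QUERY_LENGTH = 994             # query spans positions 1..994
DB_SEQS = 27_513_289           # "Searched: 27513289 seqs"
DB_RESIDUES = 14_931_090_276   # "... 14931090276 residues"
SEARCH_SECONDS = 3155.99       # "Search time 3155.99 Seconds"
STRANDS = 2                    # assumption: forward and complement both searched


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Dynamic-programming cells filled per second, in millions."""
    return strands * query_len * db_residues / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.2f} million cell updates/sec")   # header reports 9405.294
print(STRANDS * DB_SEQS, "candidate hits")      # header reports 55026578
```

Under these assumptions the computed rate agrees with the header to within the rounding of the search time.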



Database : EST:* 

1: em_estba:* 

2: em_esthum:* 

3: em_estin:* 

4: em_estmu:* 

5: em_estov:* 

6: em_estpl:* 

7: em_estro:* 

8: em_htc:* 

9: gb_est1:*

10: gb_est2:* 

11: gb_htc:* 

12: gb_est3:* 

13: gb_est4:* 

14: gb_est5:*

15: em_estfun:*

16: em_estom:*

17: em_gss_hum:*

18: em_gss_inv:*

19: em_gss_pln:*

20: em_gss_vrt:*

21: em_gss_fun:*

22: em_gss_mam:*

23: em_gss_mus:*

24: em_gss_pro:*

25: em_gss_rod:*

26: em_gss_phg:*

27: em_gss_vrl:*

28: gb_gss1:*

29: gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
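
The "Query Match" percentage printed for each result appears to be that result's score divided by the "Perfect score" in the header (994). The report does not state this formula; the sketch below is an inference that happens to reproduce the figures printed for all seven results in this listing. The function name is illustrative.

```python
PERFECT_SCORE = 994  # "Perfect score" from the report header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, rounded to one decimal."""
    return round(100.0 * score / perfect_score, 1)


# (score, Query Match printed in the report) for RESULT 1 to RESULT 7
reported = [(674, 67.8), (586, 59.0), (467.2, 47.0), (467, 47.0),
            (404, 40.6), (395.6, 39.8), (381, 38.3)]
for score, printed in reported:
    print(score, query_match_percent(score), printed)
```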
ALIGNMENTS



RESULT 1 
BI918620 
LOCUS       BI918620                 805 bp    mRNA    linear   EST 16-OCT-2001
DEFINITION  603176570F1 NIH_MGC_121 Homo sapiens cDNA clone IMAGE:5240969 5',
            mRNA sequence.
ACCESSION   BI918620
VERSION     BI918620.1  GI:16182295
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 805)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11607 row: k column: 18
            High quality sequence start: 2
            High quality sequence stop: 778.
FEATURES             Location/Qualifiers
     source          1..805
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5240969"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
ORIGIN



Query Match 67.8%; Score 674; DB 12; Length 805; 

Best Local Similarity 98.7%; Pred. No. 2.3e-130; 
Matches 732; Conservative 0; Mismatches 5; Indels 5; Gaps 5;

Qy      1 ATGAGGCGCGACCCGGCCCCCGGC-TTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTG 59
          |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Db     64 ATGAGGCGCGACCCGGCCCCCGGCGTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTG 123

Qy     60 CTACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGA 119
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    124 CTACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGA 183

Qy    120 GGGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCC 179
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    184 GGGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCC 243

Qy    180 GCCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGG 239
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    244 GCCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGG 303

Qy    240 GGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCA 299
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    304 GGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCA 363

Qy    300 GCGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTG- 358
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    364 GCGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGT 423

Qy    359 CCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTG 418
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    424 CCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTG 483

Qy    419 ACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGA 478
          ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db    484 ACTGCGCCACCCGGCCCAAGTTGAAGAAGATGACGAGCCAGACGGGACAGGTGGGTGAGA 543

Qy    479 AGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCA 538
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    544 AGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCA 603

Qy    539 AGGATGGCAAGGAGCTCAACCG-CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGA 597
          |||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
Db    604 AGGATGGCAAGGAGCTCAACCGTCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGA 663

Qy    598 AAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGA-GTATGTCTG 656
          |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db    664 AAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGGTATGTCTG 723

Qy    657 CGAGG-CCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCG 715
          ||||| |||||||||||||||||||||||||||||||||||||| |||||||||||||
Db    724 CGAGGCCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGTTTTACGTCAACAGGT 783

Qy    716 TGAGCACCACCCTGTCATCCTG 737
          ||||||||| ||||||||||||
Db    784 TGAGCACCAACCTGTCATCCTG 805



RESULT 2 
BM914622 

LOCUS       BM914622                1047 bp    mRNA    linear   EST 12-MAR-2002
DEFINITION  AGENCOURT_6615334 NIH_MGC_113 Homo sapiens cDNA clone IMAGE:5480308
            5', mRNA sequence.
ACCESSION   BM914622
VERSION     BM914622.1  GI:19365001
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1047)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Mark Watson
            cDNA Library Preparation: Rubin Laboratory
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLCM2002 row: p column: 05
            High quality sequence stop: 541.
FEATURES             Location/Qualifiers
     source          1..1047
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5480308"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_113"
                     /note="Organ: spleen; Vector: pOTB7; Site_1: XhoI; Site_2:
                     EcoRI; cDNA made by oligo-dT priming. Directionally cloned
                     into EcoRI/XhoI sites using the following 5' adaptor:
                     GGCACGAG(G). Library constructed by Ling Hong in the
                     laboratory of Gerald M. Rubin (University of California,
                     Berkeley) using ZAP-cDNA synthesis kit (Stratagene) and
                     Superscript II RT (Life Technologies). Note: this is a
                     NIH_MGC Library."
ORIGIN



Query Match 59.0%; Score 586; DB 12; Length 1047; 

Best Local Similarity 98.2%; Pred. No. 6.4e-112; 

Matches 603; Conservative 0; Mismatches 10; Indels 1; Gaps 1;

Qy    272 GCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTTTCCTGGAGCCCACGGAAC 331
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 GCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTTTCCTGGAGCCCACGGAAC 60

Qy    332 AGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGA 391
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 AGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGA 120

Qy    392 AAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGA 451
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 AAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGA 180

Qy    452 AGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTA 511
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 AGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTA 240

Qy    512 ATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACA 571
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 ATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACA 300

Qy    572 TTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGG 631
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGG 360

Qy    632 TGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCC 691
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 TGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCC 420

Qy    692 GGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCC 751
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCC 480

Qy    752 GGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCG 811
          ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
Db    481 GGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGCCTGCTACTACATCG 540

Qy    812 AGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTT-G 870
          ||| |||||| ||||| |||||||||||||| |||||||||||| |||  | |||||| |
Db    541 AGGCCATCAATCAGCTTTCCTGCAAATGTCCCAATGGATTCTTCCGACCAACATGTTTGG 600

Qy    871 GAGAAACTGCCTTT 884
          ||||||||||| ||
Db    601 GAGAAACTGCCCTT 614

RESULT 3

BI412864/c 

LOCUS       BI412864                1041 bp    mRNA    linear   EST 14-AUG-2001
DEFINITION  602988202F1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:5144016 5',
            mRNA sequence.
ACCESSION   BI412864
VERSION     BI412864.1  GI:15173787
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1041)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11355 row: d column: 01
            High quality sequence start: 11
            High quality sequence stop: 645.
FEATURES             Location/Qualifiers
     source          1..1041
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="Czech II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5144016"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
ORIGIN



Query Match 47.0%; Score 467.2; DB 12; Length 1041; 

Best Local Similarity 86.6%; Pred. No. 4.4e-87; 

Matches 563; Conservative 0; Mismatches 78; Indels 9; Gaps 4; 
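
The "/c" suffix on the BI412864 identifier, together with database coordinates that count downward in the alignment for this result, suggests that the hit lies on the complementary strand of the database entry. That reading is an interpretation; the report itself does not define the suffix. Under that assumption, the strand shown in the Db lines could be produced from the stored EST sequence with a reverse complement, sketched here with an illustrative helper name:

```python
# Complement table for unambiguous bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))   # GCAT
print(reverse_complement("AAGTN"))  # NACTT
```

Characters outside the table (for example other IUPAC ambiguity codes) pass through unchanged, which is a simplification.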

165 CAGCACCCGAGAGCCGCCCGCCTCGGGTCGGGT GGCGTTGGTAAAGGTGCTGGACA 22 0 

N I I I I II I I I I I I I M I I I I III I I I I I || | | | | | | M I M 

Db ^56 CACCTCGAGATGCCCGCCCGCCTCGGGTTCGGTTGGCGTCTTGGTGAAAGGTGCTGGACA 597 

221 AGTGGCCG— CTCCGGAGCGGGGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTG 27 8 
HI INI Mill II I I I I I I I II I I | | M I I I I I I I I I I I I I | | | | | | | M I I I 

Db 596 AGTTGCCGGCTCCCGGATCGGGGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTG 537 

Qy 279 TGTGCCGCTCGAAAGGAACCAGCGCTACATCTT-TTTCCTGGAGCCCACGGAACAGCCCT 337

H I I I I I I I I I I I I I I II I I I I I I I I I I I I I I M I I I II I I I I I I I 

Db 536 TGCGCCGCTCGAAAGGAACCAGCGCTACATCTTGTTTCCTGGAGCCCACCGAGCAGCCCT 477 

Qy 338 T AGT CT T TAAGAC GGCCTTTGCCCCCCTC GAT AC CAAC GGCAAAAAT C T CAAGAAAGAGG 397 

I I I I I I I I I I I I II I I I I I I I || I | I | | | | | | | M | | | | 

Db 47 6 TAGTT TT TAAGAC AGCCT T TTGCCCCG GT C GAC C CTAC GGCAAAT AC AT CAAGAAAGAG G 417 

Qy 398 T G GG CAAGAT C CT GT GCACT GAC TGCGC C AC C C GGC C CAAGT T GAAGAAGAT GAAGAGC C 457 

I I I I I I I N N I I I I I I I I | | | M I I I I I I I I I II I I I | | | | || || | | | | | | | | | | | | | 
Db 416 T GGGCAAGAT C CT GT GC AC T GAC T GC GC C AC C C GGCC CAAG C T GAAGAAGAT GAAGAGC C 357 

Qy 45 8 AGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCC 517 

I I I I HI I I I I I I I I M I M I M Mh.: II I I I I I I I | | I | | II || || | | | I 
Db 356 AGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCAAGTGTGAGGCAGCGGCGGGAAACCCCC 297 



Qy 



518 AGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCA 577 
I "I II I I I I I I I I | M I I I I I I I I || | | | | | M 



296 AGCCCTCCTATCGCTGGTTCAAGGATGGCAAGGAACTCAACCGGAGTCGTGATATTCGCA 237 



Ov 


S7 ft 


I LAAAl Al GG L AACG G C AGAAAGAACT C AC GACT ACAGT T CAACAAGGT GAAG GT GGAG G 

I I 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 | | || | | | | | | | | | | M 1 

TCAAGTATGGCAATGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAGGGTGGAGG 


637 




^ 3 D 


177 




DO 0 


ACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCC 

1 II MINIM 1 1 1 1 1 I I 1 1 1 I 1 | | | || | | | | | in im | | | | M 

ATGCCGGGGAGTACGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCC 


697 


UD 


1 / b 


117 


Qy 


698 


GGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGT 

1 II 1 1 1 1 1 1 1 II 1 1 1 1 I I I || | | | | | || | | M | II II 1 

GAC T C CAT GT CAACAGC GT GAG CAC CACT CT GT CAT C CT GGT C G GGAC AT GC C C G GAAGT 


757 


Db 


116 


57 


Qy 


758 


GCAACGAGACAGCCAAGTCCTA— TTGCGTCAATGGAGGCGTCTGCTACT 805 

M MINIM II 1 1 II 1 1 1 1 II 

GCAATGAGACCGCCAAGTCCTACCATGTGTGAATGGAGGCGTGTGCTACT 7 




Db 


56 





RESULT 4 
BX281777 
LOCUS       BX281777                 524 bp    mRNA    linear   EST 04-MAR-2003
DEFINITION  BX281777 NIH_MGC_121 Homo sapiens cDNA clone IMAGp998K1811607 ;
            IMAGE:5240969, mRNA sequence.
ACCESSION   BX281777
VERSION     BX281777.1  GI:28612804
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 524)
  AUTHORS   Ebert,L., Heil,O., Hennig,S., Neubert,P., Partsch,E., Peters,M.,
            Radelof,U., Schneider,D. and Korn,B.
  TITLE     Human UnigeneSet - RZPD3
  JOURNAL   Unpublished (2003)
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998K1811607.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Human UnigeneSet - RZPD3 (RZPDLIB No. 972)
            http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=972
            Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            M13u, Primer sequence: CGTTGTAAAACGACGGCCAGT.
FEATURES             Location/Qualifiers
     source          1..524
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGp998K1811607 ; IMAGE:5240969"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
ORIGIN



Query Match 47.0%; Score 467; DB 13; Length 524; 

Best Local Similarity 100.0%; Pred. No. 3.6e-87; 

Matches 467; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
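
The "Best Local Similarity" value appears to be the number of matches divided by the number of aligned columns (matches plus mismatches plus indels). This is an inference from the printed numbers rather than a documented definition. For this result, 467 matches with no mismatches or indels gives 100.0%, and the same ratio reproduces the figures printed for the other results whose counts are fully legible. The function name below is illustrative.

```python
def best_local_similarity(matches, mismatches, indels):
    """Matches as a percentage of aligned columns, rounded to one decimal."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# (matches, mismatches, indels, percentage printed in the report)
reported = [
    (603, 10, 1, 98.2),   # RESULT 2
    (563, 78, 9, 86.6),   # RESULT 3
    (467, 0, 0, 100.0),   # RESULT 4
    (405, 40, 0, 91.0),   # RESULT 7
]
for matches, mismatches, indels, printed in reported:
    print(best_local_similarity(matches, mismatches, indels), printed)
```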



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     58 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 117

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    118 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 177

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    178 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 237

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    238 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 297

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    298 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 357

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    358 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 417

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    418 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 477

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACA 467
          |||||||||||||||||||||||||||||||||||||||||||||||
Db    478 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACA 524





RESULT 5 
AA706226 

LOCUS       AA706226                 549 bp    mRNA    linear   EST 12-JAN-1999
DEFINITION  ah28a07.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            1240116 3' similar to TR:P43328 P43328 NEU DIFFERENTIATION FACTOR
            NDF04 ;, mRNA sequence.
ACCESSION   AA706226
VERSION     AA706226.1  GI:2716144
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 549)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Possible reversed clone: similarity on wrong strand
            Possible reversed clone: polyT not found
            Insert Length: 689 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 451.
FEATURES             Location/Qualifiers
     source          1..549
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="1240116"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D
                     (Pharmacia) with a modified polylinker; Site_1: Not I;
                     Site_2: Eco RI; 1st strand cDNA was primed with a Not I -
                     oligo(dT) primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
ORIGIN



Query Match 40.6%; Score 404; DB 9; Length 549; 

Best Local Similarity 98.5%; Pred. No. 5.5e-74; 

Matches 407; Conservative 0; Mismatches 6; Indels 0; Gaps 0;

Qy    424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     15 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 74

Qy    484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
          ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db     75 TCGCTGAAGTGTGAGGCAGCAGCGGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 134

Qy    544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAAC 603
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    135 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAAC 194

Qy    604 TCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCC 663
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    195 TCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCC 254

Qy    664 GAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACC 723
          ||||||||||||||||||||||||||| | |||||||||||||||||||||||||  |||
Db    255 GAGAACATCCTGGGGAAGGACACCGTCGGAGGCCGGCTTTACGTCAACAGCGTGACGACC 314

Qy    724 ACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGC 783
          |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
Db    315 ACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGNGACAGCCAAGTCCTATTGC 374

Qy    784 GTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 836
          |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    375 GTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 427



RESULT 6 
AI041451 
LOCUS       AI041451                 412 bp    mRNA    linear   EST 28-AUG-1998
DEFINITION  ow36c02.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            IMAGE:1648898 3' similar to TR:O14511 O14511 NTAK. ;, mRNA
            sequence.
ACCESSION   AI041451
VERSION     AI041451.1  GI:3280645
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 412)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Trace considered overall poor quality
            Insert Length: 671 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 1.
FEATURES             Location/Qualifiers
     source          1..412
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:1648898"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D
                     (Pharmacia) with a modified polylinker; Site_1: Not I;
                     Site_2: Eco RI; 1st strand cDNA was primed with a Not I -
                     oligo(dT) primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
ORIGIN



Query Match 39.8%; Score 395.6; DB 9; Length 412; 

Best Local Similarity 97.6%; Pred. No. 2.8e-72; 

Matches 401; Conservative 0; Mismatches 10; Indels 0; Gaps 0;

Qy    426 CACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATC 485
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 CACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATC 60

Qy    486 GCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGG 545
          |||||||||||||||||||||    |||||||||||||||||||||||||||||||||||
Db     61 GCTGAAGTGTGAGGCAGCAGCGATAAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGG 120

Qy    546 CAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTC 605
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 CAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTC 180

Qy    606 ACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGA 665
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGA 240

Qy    666 GAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCAC 725
          |||||||||||||||||||||||| || |||||||||||||||||||||||||  |||||
Db    241 GAACATCCTGGGGAAGGACACCGTACGAGGCCGGCTTTACGTCAACAGCGTGACGACCAC 300

Qy    726 CCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGT 785
          ||||||||||||||||||||||||| |||||||||||| |||||||||||||||||||||
Db    301 CCTGTCATCCTGGTCGGGGCACGCCGGGAAGTGCAACGNGACAGCCAAGTCCTATTGCGT 360

Qy    786 CAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 836
          |||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 411



RESULT 7 
BX529505 
LOCUS       BX529505                 488 bp    mRNA    linear   EST 27-JUN-2003
DEFINITION  BX529505 NCI_CGAP_Mam3 Mus musculus cDNA clone IMAGp998N017639 ;
            IMAGE:3153984, mRNA sequence.
ACCESSION   BX529505
VERSION     BX529505.1  GI:32297863
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 488)
  AUTHORS   Heil,O., Ebert,L., Neubert,P., Peters,M., Radelof,U., Schneider,D.
            and Korn,B.
  TITLE     Mouse UnigeneSet - RZPD2
  JOURNAL   Unpublished (2003)
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998N017639.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Mouse UnigeneSet - RZPD2 (RZPDLIB No. 981)
            http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=981
            Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            SP6, Primer sequence: ATTTAGGTGACACTATAG.
FEATURES             Location/Qualifiers
     source          1..488
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="129, C57BL/6J, FVB/N"
                     /db_xref="taxon:10090"
                     /clone="IMAGp998N017639 ; IMAGE:3153984"
                     /tissue_type="tumor, gross tissue"
                     /dev_stage="10 months"
                     /lab_host="DH10B"
                     /clone_lib="NCI_CGAP_Mam3"
                     /note="Organ: mammary; Vector: pCMV-SPORT6; Site_1: SalI;
                     Site_2: NotI; Cloned unidirectionally. Primer: Oligo dT.
                     Library constructed by Life Technologies. Investigators
                     providing samples: Lothar Hennighausen/Chu-Xia Deng, NIH
                     Reference for transgenic model: Xu et al., Nature Genetics
                     22, 37-43 (1999)."
ORIGIN



Query Match 38.3%; Score 381; DB 13; Length 488; 

Best Local Similarity 91.0%; Pred. No. 3.4e-69; 

Matches 405; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy   469 GTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTAC 528
         |||||||||||||| ||||| |||||||||||||| || || || |||||||| ||||| 
Db     1 GTGGGTGAGAAGCAGTCGCTCAAGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTAT 60

Qy   529 CGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGC 588
         || |||||||||||||||||||| |||||||| || || || ||||||||||| ||||||
Db    61 CGCTGGTTCAAGGATGGCAAGGAACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGC 120

Qy   589 AACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAG 648
         || ||||||||||||||||| |||||||||||||| |||| ||||||||| || ||||||
Db   121 AATGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAG 180

Qy   649 TATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTC 708
         || ||||| ||||||||||||||||| ||||||||||||||  ||||||| ||  | |||
Db   181 TACGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTC 240

Qy   709 AACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACA 768
         ||||||||||||||||| ||||||||||||||||| || |||||||||||||| ||||| 
Db   241 AACAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACC 300

Qy   769 GCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTC 828
         ||||||||||| || || ||||||||||| ||||||||||||||||||||||||||||||
Db   301 GCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTC 360

Qy   829 TCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGA 888
         ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db   361 TCCTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGA 420

Qy   889 TTGTACATGCCAGATCCTAAGCAAA 913
         |||||||||||||||||||||||||
Db   421 TTGTACATGCCAGATCCTAAGCAAA 445



RESULT 8
BF108794
LOCUS       BF108794  427 bp  mRNA  linear  EST 20-OCT-2000
DEFINITION  7152g03.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone
            IMAGE:3525292 3' similar to SW:NTAK_HUMAN O14511 NTAK PROTEIN
            ; contains MSR1.t1 MSR1 repetitive element ;, mRNA sequence.
ACCESSION   BF108794
VERSION     BF108794.1  GI:10938484
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 427)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            This clone is available royalty-free through LLNL ; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            Seq primer: -40UP from Gibco
            High quality sequence stop: 396.
FEATURES             Location/Qualifiers
     source          1..427
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:3525292"
                     /lab_host="DH10B"
                     /clone_lib="Soares_NSF_F8_9W_OT_PA_P_S1"
                     /note="Organ: pooled; Vector: pT7T3D-Pac (Pharmacia) with
                     a modified polylinker; Site_1: Not I; Site_2: Eco RI;
                     Equal amounts of plasmid DNA from five normalized
                     libraries were mixed, and ss circles were made in vitro.
                     Following HAP purification, this DNA was used as tracer in
                     a subtractive hybridization reaction. The driver was
                     PCR-amplified cDNAs from pools of 5,000 clones made from
                     the same 5 libraries. The pools consisted of the following
                     libraries and cloneIDs: Soares NbHSF pool 1:
                     309384-310919, 323208-325895 Soares Nb2HP pool 1:
                     145032-147335, 147720-148103, 148872-149255, 15002 -
                     150407, 151176-152327 Soares Nb2HF8-9W pool 1:
                     758280-760583, 772104-774407 Soares NbHPA pool 1:
                     304776-306311, 320136-322823, 326280-326663 Soares NbHOT
                     pool 1: 723720-726407, 739080-740999 Subtraction by Bento
                     Soares and M. Fatima Bonaldo."
ORIGIN

Query Match 36.6%; Score 363.8; DB 10; Length 427; 

Best Local Similarity 91.3%; Pred. No. 1.3e-65; 

Matches 386; Conservative 0; Mismatches 37; Indels 0; Gaps 0; 

Qy   363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
         || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||   
Db     5 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 64

Qy   423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    65 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 124

Qy   483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   125 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 184

Qy   543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   185 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 244

Qy   603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   245 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 304

Qy   663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   305 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 364

Qy   723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   365 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 424

Qy   783 CGT 785
         |||
Db   425 CGT 427



RESULT 9
BI410828/C
LOCUS       BI410828  949 bp  mRNA  linear  EST 14-AUG-2001
DEFINITION  602963734F1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:5119065 5',
            mRNA sequence.
ACCESSION   BI410828
VERSION     BI410828.1  GI:15171751
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 949)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11290 row: d column: 10
            High quality sequence start: 28
            High quality sequence stop: 919.
FEATURES             Location/Qualifiers
     source          1..949
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="Czech II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5119065"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
ORIGIN

Query Match 31.9%; Score 317.4; DB 12; Length 949; 

Best Local Similarity 84.2%; Pred. No. 9e-56; 

Matches 442; Conservative 0; Mismatches 71; Indels 12; Gaps 7; 

Qy 3 97 GT GG GCAAGAT C CT GT G CACT GAC T G C GC C AC CC GG C C C AAGT T GAAGAAGAT GAAGAG C 456 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M | | 
Db 947 GT GGGC C AGAT C CT GG G CAC T G- C T G C GC C AC C C GC C C C AA- C T GAAGAAGAT GAAG A- C 891 

QY 457 C AGAC GGGAC AG GT G GGT GAGAAGCAAT C GC T GAAGT GT GAG GC AG C AG C C G GTAAT C C C 516 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I 

Db 8 90 CAAACCAGAAGAGTCGGTGAGAACAGTTCGCTCAAGTGTGAGGCACGGCCGGGGAAACCC 8 31 

Qy 517 CAGCCTTCCTACC GT T GGT T CAAG GAT GGC - AAG GAGC T CAAC C GC AG C CGAGAC 57 0 

I I IN M I I I I I I I II I I I I I II I I I I I I I I I I I I I II I I 

Db 830 C C C CAC C CC T C C CT AT CGCTGGTTT CAAG GAT GGC AAAGGAAC T CAAC C GGAGT CGT GAT 771 

Qy 571 AT T C GC AT CAAAT AT G G CAAC GG C AGAAAGAACT CAC GACT ACAGT T CAAC AAGGT GAAG 630 

I I I I I I I I I I I MM Ml I II II I I I I M I I I I I I II I I I I I II I I I I I II 
Db 770 AT T C G CAT CAAGT AT GC CAAT GGC AGAAAGAACT CACGG CT AC AGT T CAACAAAAGT GAG 711 

Qy 631 GT — GGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCG 68 8 

M I I M I II I II I I Mill I I I II I II I II II II II I I I I II I I I I I I I 
Db 710 GT T GGAGGAT T GC C GG G GAGT AC GT C T GT GAG GC CGAGAACAT C C T T GGGAAGGACAC C G 651 

Qy 689 TCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACG 748 

I M I I I I II I I I I I II I I I I II I I I I II I I I II II I M II I II II M II I 
Db 650 T GA- GGGC C GACT C CAT GT CAACAGCGT GAGC AC CACT CT GT CAT C CT GGT C GGGACAT G 592 

Qy 749 CCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACA 808 

I M II II I I II I I Mill I II I II II I II II II II II II II I II || I I I I M || 

Db 591 CCCGGAAGTGCAATGAGACCGC CAAGT CCT ACT GTGTGAATGGAGGCGTGTGCT AC TACA 532 

Qy 809 T C GAGGGC AT CAAC CAGC T CT CCT GCAAAT GT C CAAAT GGAT T C T T C GGAC AGAGAT GT T 868 

M I I M II I I II I II II I II I II I II II I II II || || || || || || || | M || || | | || | 
Db 531 T C GAGG GC AT CAAC C AGCT C T C C T GCAAAT GT C CAAAC G GATT CT T C GGAC AGAGAT GT T 472 

Qy 8 69 T GGAGAAACTGC CTTT GC GAT T GT ACAT GCCAGAT CCTAAGCAAA 913 

II II II II I I I II II I I I II I I I I II I II II II II II I I I II II I 

Db 471 T GGAGAAACT GC CT TT G C GAT T GT AC AT GC CAGAT C CTAAG CAAA 427 



RESULT 10
BI651936
LOCUS       BI651936  795 bp  mRNA  linear  EST 12-SEP-2001
DEFINITION  603298677F1 NCI_CGAP_Mam3 Mus musculus cDNA clone IMAGE:5339251 5',
            mRNA sequence.
ACCESSION   BI651936
VERSION     BI651936.1  GI:15566172
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 795)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Lothar Hennighausen Ph.D., Chu-Xia Deng Ph.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11861 row: j column: 20
            High quality sequence stop: 795.
FEATURES             Location/Qualifiers
     source          1..795
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="129, C57BL/6J, FVB/N"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5339251"
                     /tissue_type="tumor, gross tissue"
                     /dev_stage="10 months"
                     /lab_host="DH10B"
                     /clone_lib="NCI_CGAP_Mam3"
                     /note="Organ: mammary; Vector: pCMV-SPORT6; Site_1: SalI;
                     Site_2: NotI; Cloned unidirectionally. Primer: Oligo dT.
                     Library constructed by Life Technologies. Investigators
                     providing samples: Lothar Hennighausen/Chu-Xia Deng, NIH
                     Reference for transgenic model: Xu et al., Nature Genetics
                     22, 37-43 (1999)."
ORIGIN



Query Match 27.4%; Score 272.6; DB 12; Length 795;
Best Local Similarity 92.3%; Pred. No. 1.9e-46;
Matches 298; Conservative 0; Mismatches 24; Indels 1; Gaps 1;

Qy   592 GGCAGAAAGAACTCACGAC-TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTA 650
         ||||||||||||||||| | ||||||||||||| |||| ||||||||| || ||||||||
Db     1 GGCAGAAAGAACTCACGGCTTACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTA 60

Qy   651 TGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAA 710
          ||||| ||||||||||||||||| ||||||||||||||  ||||||| ||  | |||||
Db    61 CGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAA 120

Qy   711 CAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGC 770
         ||||||||||||||| ||||||||||||||||| || |||||||||||||| ||||| ||
Db   121 CAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGC 180

Qy   771 CAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTC 830
         ||||||||| || || ||||||||||| ||||||||||||||||||||||||||||||||
Db   181 CAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTC 240

Qy   831 CTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATT 890
         ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db   241 CTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATT 300

Qy   891 GTACATGCCAGATCCTAAGCAAA 913
         |||||||||||||||||||||||
Db   301 GTACATGCCAGATCCTAAGCAAA 323



RESULT 11
BE983573
LOCUS       BE983573  333 bp  mRNA  linear  EST 29-APR-2002
DEFINITION  UI-M-CG0p-bgi-c-07-0-UI.s1 NIH_BMAP_Ret4_S2 Mus musculus cDNA clone
            UI-M-CG0p-bgi-c-07-0-UI 3', mRNA sequence.
ACCESSION   BE983573
VERSION     BE983573.1  GI:10654893
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 333)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: Chin, H
            National Institute of Mental Health
            6001 Executive Blvd. Room 7N-7190, MSC 9643, Bethesda, MD
            20892-9643, USA
            Tel: 301 443 1706
            Fax: 301 443 9890
            Email: mEST@mail.nih.gov
            Oligo-dT track not found, Not I site shown in beginning of sequence
            is likely internal to the message. cDNA Library Preparation: M.B.
            Soares Lab Clone distribution: Researchers may obtain BMAP cDNA
            clones from RESEARCH GENETICS. It should be noted that Bento Soares
            is generating a small number of additional specialized
            non-redundant arrays of BMAP cDNAs whose availability will be
            considered under appropriate and limited collaborative arrangements
            The tissue for this library was contributed by Dr. Xin-Yuan Fu,
            Yale University School of Medicine The following repetitive
            elements were found in this cDNA sequence: 15-105,
            >GC_rich#Low_complexity
            Seq primer: M13 Forward
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..333
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="taxon:10090"
                     /clone="UI-M-CG0p-bgi-c-07-0-UI"
                     /lab_host="DH10B (Life Technologies)"
                     /clone_lib="NIH_BMAP_Ret4_S2"
                     /note="Vector: pT7T3D-Pac (Pharmacia) with a modified
                     polylinker; Site_1: Not I; Site_2: Eco RI; The
                     NIH_BMAP_Ret4_S2 library is a subtracted library,
                     ultimately derived from mouse retina tissue libraries at
                     various stages of development. For a detailed description
                     of the library from which this clone was derived, please
                     visit our web site at brainest.eng.uiowa.edu. The tissue
                     for this library was contributed by Dr. Xin-Yuan Fu, Yale
                     University School of Medicine
                     TAG_SEQ=None found"
ORIGIN



Query Match 23.6%; Score 234.4; DB 10; Length 333; 

Best Local Similarity 95.6%; Pred. No. 1.3e-38; 

Matches 241; Conservative 0; Mismatches 11; Indels 0; Gaps 0; 



Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db    82 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 141

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   142 TACTCGCCCAGCCTCAAGTCGGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 201

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||||
Db   202 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCG 261

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db   262 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 321

Qy   241 GGGCTGCAGCGC 252
         ||||||||||||
Db   322 GGGCTGCAGCGC 333





RESULT 12
AW476657
LOCUS       AW476657  529 bp  mRNA  linear  EST 24-FEB-2000
DEFINITION  uq79e01.y1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:2937336 5'
            similar to TR:O35073 O35073 NTAK ALPHA2-1P ;, mRNA sequence.
ACCESSION   AW476657
VERSION     AW476657.1  GI:7046763
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 529)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Other_ESTs: uq79e01.x1
            Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            MGI:1049756
            Seq primer: -40RP from Gibco
            High quality sequence stop: 459.
FEATURES             Location/Qualifiers
     source          1..529
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="Czech II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:2937336"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
ORIGIN

Query Match 23.4%; Score 232.2; DB 10; Length 529; 

Best Local Similarity 93.1%; Pred. No. 4.6e-38; 

Matches 243; Conservative 0; Mismatches 18; Indels 0; Gaps 0; 

Qy   653 TCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACA 712
         |||| ||||||||||||||||| ||||||||||||||  ||||||| ||  | |||||||
Db     2 TCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACA 61

Qy   713 GCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCA 772
         ||||||||||||| ||||||||||||||||| || |||||||||||||| ||||| ||||
Db    62 GCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCA 121

Qy   773 AGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCT 832
         ||||||| || || ||||||||||| ||||||||||||||||||||||||||||||||||
Db   122 AGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCT 181

Qy   833 GCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGT 892
         ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db   182 GCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGT 241

Qy   893 ACATGCCAGATCCTAAGCAAA 913
         |||||||||||||||||||||
Db   242 ACATGCCAGATCCTAAGCAAA 262



RESULT 13
AA772412
LOCUS       AA772412  297 bp  mRNA  linear  EST 31-DEC-1998
DEFINITION  ai44e12.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            1359886 3' similar to TR:P43328 P43328 NEU DIFFERENTIATION FACTOR
            NDF04 ;, mRNA sequence.
ACCESSION   AA772412
VERSION     AA772412.1  GI:2824195
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 297)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Possible reversed clone: similarity on wrong strand
            Possible reversed clone: polyT not found
            Insert Length: 667 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 267.
FEATURES             Location/Qualifiers
     source          1..297
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="1359886"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D
                     (Pharmacia) with a modified polylinker; Site_1: Not I;
                     Site_2: Eco RI; 1st strand cDNA was primed with a Not I -
                     oligo(dT) primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M.Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
ORIGIN



Query Match 22.5%; Score 224; DB 9; Length 297; 

Best Local Similarity 92.2%; Pred. No. 1.8e-36; 

Matches 236; Conservative 0; Mismatches 20; Indels 0; Gaps 0; 



Qy   405 GATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGG 464
         || || |  | | |       |  || | |||||||||||||||||||||||||||||||
Db    42 GAGCCCGATCCCGGGAGAAAGCCACCCGGCCAAGTTGAAGAAGATGAAGAGCCAGACGGG 101

Qy   465 ACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTC 524
         |||||||||||||||||||||||||||||||||||||||||| |  ||||||||||||||
Db   102 ACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCGGTGAATCCCCAGCCTTC 161

Qy   525 CTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATA 584
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   162 CTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATA 221

Qy   585 TGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGG 644
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   222 TGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGG 281

Qy   645 GGAGTATGTCTGCGAG 660
         ||||||||||||||||
Db   282 GGAGTATGTCTGCGAG 297





RESULT 14
BX089049/C
LOCUS       BX089049  362 bp  mRNA  linear  EST 23-JAN-2003
DEFINITION  BX089049 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            IMAGp998M133119 ; IMAGE:1240116, mRNA sequence.
ACCESSION   BX089049
VERSION     BX089049.1  GI:27825909
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 362)
  AUTHORS   Ebert,L., Heil,O., Hennig,S., Neubert,P., Partsch,E., Peters,M.,
            Radelof,U., Schneider,D. and Korn,B.
  TITLE     Human UnigeneSet - RZPD3
  JOURNAL   Unpublished (2003)
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998M133119.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Human UnigeneSet - RZPD3 (RZPDLIB No. 972)
            http://www.rzpd.de/CloneCards/cgi-
            bin/showLib.pl.cgi/response?libNo=972 Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            M13r, Primer sequence: TTTCACACAGGAAACAGCTATGAC.
FEATURES             Location/Qualifiers
     source          1..362
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGp998M133119 ; IMAGE:1240116"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D
                     (Pharmacia) with a modified polylinker; Site_1: Not I;
                     Site_2: Eco RI; 1st strand cDNA was primed with a Not I -
                     oligo(dT) primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M.Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
ORIGIN

Query Match 18.1%; Score 180; DB 13; Length 362; 

Best Local Similarity 99.4%; Pred. No. 3.2e-27; 

Matches 180; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   656 GCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCG 715
         ||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
Db   362 GCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGNCCGGCTTTACGTCAACAGCG 303

Qy   716 TGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGT 775
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   302 TGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGT 243

Qy   776 CCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCA 835
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   242 CCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCA 183

Qy   836 A 836
         |
Db   182 A 182



RESULT 15
AW762061
LOCUS       AW762061  256 bp  mRNA  linear  EST 04-MAY-2000
DEFINITION  ur53c01.y1 NCI_CGAP_Mam3 Mus musculus cDNA clone IMAGE:3153984 5'
            similar to TR:O35073 O35073 NTAK ALPHA2-1P ;, mRNA sequence.
ACCESSION   AW762061
VERSION     AW762061.1  GI:7693978
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 256)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     Other_ESTs: ur53c01.x1
            Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Lothar Hennighausen Ph.D., Chu-Xia Deng Ph.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            image.llnl.gov/image/html/iresources.shtml
            MGI:1056740
            Seq primer: -40RP from Gibco
            High quality sequence stop: 161.
FEATURES             Location/Qualifiers
     source          1..256
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="129, C57BL/6J, FVB/N"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:3153984"
                     /tissue_type="tumor, gross tissue"
                     /dev_stage="10 months"
                     /lab_host="DH10B"
                     /clone_lib="NCI_CGAP_Mam3"
                     /note="Organ: mammary; Vector: pCMV-SPORT6; Site_1: SalI;
                     Site_2: NotI; Cloned unidirectionally. Primer: Oligo dT.
                     Library constructed by Life Technologies. Investigators
                     providing samples: Lothar Hennighausen/Chu-Xia Deng, NIH
                     Reference for transgenic model: Xu et al., Nature Genetics
                     22, 37-43 (1999)."
ORIGIN



Query Match 17.3%; Score 171.8; DB 10; Length 256; 

Best Local Similarity 93.7%; Pred. No. 1.4e-25; 

Matches 179; Conservative 0; Mismatches 12; Indels 0; Gaps 0; 

Qy   723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
         ||| ||||||||||||||||| || |||||||||||||| ||||| ||||||||||| ||
Db     1 CACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTG 60

Qy   783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
          || ||||||||||| |||||||||||||||||||||||||||||||||||||||||| |
Db    61 TGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTGC 120

Qy   843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
         ||| ||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db   121 AAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCTGA 180

Qy   903 TCCTAAGCAAA 913
         |||||||||||
Db   181 TCCTAAGCAAA 191



Search completed: August 15, 2004, 09:42:28 
Job time : 3164.99 secs
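
The per-result statistics lines in the listing above appear to be related by simple arithmetic: Query Match is Score divided by the Perfect score (994), and Best Local Similarity is Matches divided by the number of alignment columns (Matches + Mismatches + Indels). These relations are inferred from the printed figures rather than taken from GenCore documentation, and the helper names parse_stats and recompute are hypothetical. A minimal Python sketch that re-derives both percentages from the raw counts:

```python
import re

PERFECT_SCORE = 994  # "Perfect score" printed in the report header


def parse_stats(block: str) -> dict:
    """Extract the numeric fields from the three statistics lines of one result."""
    labels = ("Query Match", "Score", "Best Local Similarity",
              "Matches", "Mismatches", "Indels", "Gaps")
    stats = {}
    for label in labels:
        # Each field is printed as "<label> <number>" followed by "%" and/or ";".
        m = re.search(re.escape(label) + r"\s+([0-9.]+)", block)
        if m is None:
            raise ValueError(f"field not found: {label}")
        stats[label] = float(m.group(1))
    return stats


def recompute(stats: dict, perfect_score: float = PERFECT_SCORE) -> dict:
    """Re-derive the two percentages (ASSUMED relations, see the note above)."""
    columns = stats["Matches"] + stats["Mismatches"] + stats["Indels"]
    return {
        "query_match_pct": round(100 * stats["Score"] / perfect_score, 1),
        "best_local_similarity_pct": round(100 * stats["Matches"] / columns, 1),
    }


# Statistics lines copied from RESULT 7 of the listing.
result_7 = """Query Match 38.3%; Score 381; DB 13; Length 488;
Best Local Similarity 91.0%; Pred. No. 3.4e-69;
Matches 405; Conservative 0; Mismatches 40; Indels 0; Gaps 0;"""

stats = parse_stats(result_7)
derived = recompute(stats)
print(derived["query_match_pct"], stats["Query Match"])
print(derived["best_local_similarity_pct"], stats["Best Local Similarity"])
```

Comparing the re-derived percentages with the printed ones can help flag a misread digit in a scanned listing, since a mismatch between the two points at a transcription error in either the counts or the percentage.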



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on: August 15, 2004, 02:48:13 ; Search time 4242.5 Seconds
(without alignments)
10155.083 Million cell updates/sec

Title: US-09-864-675-1
Perfect score: 994
Sequence: 1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994



Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 3470272 seqs, 21671516995 residues 

Total number of hits satisfying chosen parameters: 6940544

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : GenEmbl:*
1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*
28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em_htg_other:*
33: em_htg_mus:*
34: em_htg_pln:*
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 



Result Query 

No. Score Match Length DB ID Description 
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AK124504 


Homo sapi 
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ADnn c; n £ n 


AB005060 


Homo sapi 
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Sequence 


4 
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Sequence 


5 
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Sequence 


6 
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80 


.5 


2947 


10 


D89995 


D89995 


Rattus sp. 


7 
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3076 


6 
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E16456 Rat mRNA fo 


8 
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.4 


3077 


10 


D89996 


D89996 


Rattus sp. 


9 
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AR072052 
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Sequence 


10 
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AR098144 


AR098144 


Sequence 


11 
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55 
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1607 


6 
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Sequence 
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6 


AR098146 


AR098146 


Sequence 
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492 


49 
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1476 


6 


AR116618 


AR116618 


Sequence 
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49 
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6 


AR098155 


AR098155 


Sequence 
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49 


5 


2268 


6 


AR116627 
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Sequence 


16 
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49 


0 
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10 


AB001576 


AB001576 Rattus sp 


17 


467. 8 
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1 
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6 


AR098143 


AR098143 


Sequence 


18 467.8 47.1 2467 6 AR116615 AR116615 Sequence
19 424.8 42.7 118504 9 AC094080 AC094080 Homo sapi
20 424.8 42.7 152838 2 AC011589 AC011589 Homo sapi
21 424.8 42.7 170797 9 AC011379 AC011379 Homo sapi
22 424.8 42.7 210675 2 AC026272 AC026272 Homo sapi


23 


424 


42. 


7 
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6 


AX406616 


AX406616 


Sequence 


24 
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42. 


7 
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9 


HS2NRG01 


AF119151 


Homo sapi 


25 
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39. 


0 


139074 


2 


AC131191 


AC131191 


Mus muscu 


26 
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6 


253462 


2 
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AC096477 


Rattus no 


27 
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2 
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6 
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Sequence 


28 
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9 
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10 
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AY227 025 Mus muscu 


29 
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Sequence 


30 
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17. 
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9 
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Homo sapi 


31 
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17. 


4 
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9 


AC008523 


AC008523 


Homo sapi 


32 
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4 


189049 


9 


AC008667 


AC008667 


Homo sapi 


33 
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2 


AC020830 


AC020830 


Mus muscu 





34 
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3 
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AC127350 Mus muscu 




35 
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Sequence 




39 
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Sequence 




40 
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Homo sapi 




41 
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4 


12. 


3 
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6 


AX406619 


AX406619 


Sequence 




42 122.4 12.3 350 9 HS2NRG04 AF119154 Homo sapi
c 43 108.4 10.9 205280 2 BX323592 BX323592 Danio rer




44 
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4 


10. 


9 


207840 


5 


BX005008 


BX005008 


Zebraf ish 


c 


45 


10 


8 
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9 
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2 


AC020830 


AC020830 


Mus muscu 
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RESULT 1 
AK124504 
LOCUS AK124504 2613 bp mRNA linear PRI 09-SEP-2003
DEFINITION Homo sapiens cDNA FLJ42513 fis, clone BRACE2046295, highly similar
to NTAK PROTEIN.
ACCESSION AK124504
VERSION AK124504.1 GI:34530302
KEYWORDS oligo capping; fis (full insert sequence).
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Kawakami,B., Sugiyama,A., Takemoto,M., Sugiyama,T., Irie,R.,
Otsuki,T., Sato,H., Wakamatsu,A., Ishii,S., Yamamoto,J., Isono,Y.,
Kawai-Hio,Y., Saito,K., Nishikawa,T., Kimura,K., Yamashita,H.,
Matsuo,K., Nakamura,Y., Sekine,M., Kikuchi,H., Kanda,K.,
Wagatsuma,M., Murakawa,K., Kanehori,K., Takahashi-Fujii,A.,
Oshima,A., Suzuki,Y., Sugano,S., Nagahari,K., Masuho,Y., Nagai,K.
and Isogai,T.
TITLE NEDO human cDNA sequencing project
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 2613)
AUTHORS Isogai,T. and Yamamoto,J.
TITLE Direct Submission
JOURNAL Submitted (15-JUL-2003) Takao Isogai, FLJ Project (HRI Team); 2-6-7
Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan
(E-mail:genomics@hri.co.jp, Tel:81-438-52-3975, Fax:81-438-52-3986)
COMMENT NEDO human cDNA sequencing project supported by Ministry of
Economy, Trade and Industry of Japan; cDNA full insert sequencing:
Research Association for Biotechnology (RAB); cDNA library
construction: Helix Research Institute (HRI) (supported by Japan
Key Technology Center etc.); 5'- & 3'-end one pass sequencing: RAB,
HRI, and Biotechnology Center, National Institute of Technology and
Evaluation; clone selection for full insert sequencing: HRI and
RAB; annotation: HRI and RAB.
FEATURES Location/Qualifiers
source 1..2613
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="BRACE2046295"
/tissue_type="cerebellum"
/clone_lib="BRACE2"
/note="cloning vector: pME18SFL3"

ORIGIN 

Query Match 99.8%; Score 992.4; DB 9; Length 2613; 

Best Local Similarity 99.9%; Pred. No. 6.3e-187; 

Matches 993; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I N I I I I I I I I I I I I I I I I I I | | | | | M I I I I I I I I I I I I I I I I I | | | | I M I I I M I I 
Db 436 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTTGCTCGCCTGC 4 95 

Qy 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 12 0 

I I I I M I I M I I I II I I II I I I || I I I | | | | | | | | | | | | | | | | | | | || | | | | | | | | | | | | 
Db 4 96 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 555 

QY 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 18 0 

I M I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I M 

Db 556 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 615 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 24 0 

I I I I I M I I I I I I I I I I I I I M I I I I I I I I II I I I I I I I II I I I I I I I I I M 

Db 616 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 675 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II II I I I I I I I I I I I I I I I I I | | | | | | | 
Db 67 6 GGGCTGCAGCGGGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 735 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I || I | I | | | | | | | | | | | | | | 
Db 73 6 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 795 

Qy 361 C C CCT CGATAC CAACGGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT CCT GT GCACTGAC 420 

I I I I I I I I 1 I I I I I I I M I II I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | 
Db 796 C C C C T C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GG GCAAGAT CCT GT G CACT G AC 855 

Qy 421 T GC GC CAC C C GGC C C AAGT T GAAGAAGAT GAAGAG C C AGAC GGGACAGGT G GGT GAGAAG 48 0 

I I I I I I M I I I I I I I I II II I I I I I I I I | || | | | | | | | | | | || | | || | | | | | | | | | | | | | 
Db 856 T GC G C CAC C C GG C C CAAGT T GAAGAAGAT GAAGAGC CAGAC G GGACAG GT G G GT GAGAAG 915 

Qy 4 81 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 54 0 

I I I I I I I M M I I I I I I I 1 I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | || || | 
Db 916 CAATCGCTG7VAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 975 

Qy 541 GAT G GCAAG GAG C T CAAC C GC AGC C GAGAC ATT C GCAT CAAAT AT GG CAAC GG CAGAAAG 60 0 

I I M I I I I I I M I I I I I I I I I I I M I I II I I I I I I I I I I I I I I I I I I II I | | | | | | | | | | 
Db 976 GAT G GCAAGGAGCT CAAC C GC AGC C GAG ACATT C GCAT CAAAT AT GG CAAC G GC AGAAAG 1035 

Qy 601 AACT CAC GAC T ACAGT T CAACAAG GT GAAGGT G GAGGAC G CT G GGGAGT AT GT C T G C GAG 660 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I | | | | 1 | | | | | | | | | | | | | || | | | | | || | | | 
Db 1036 AAC T CAC GACT ACAGT T CAACAAG GT GAAG GT G GAG GAC G CT G G G GAGT AT GT CT G C GAG 1095 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 72 0 

I I I I I I I M I I I I M I I I I I I I I M I I I I I I I I I I I II || I I I I I I I I || | | | | | | | | | | 



Db 



109 6 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 1155 



QY 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 7 80 

I I I I I M M I I I I M I I M I M I I I I I II I I I I I I I I I I I I I I I I I I I M I M I I I I I I I 
Db 1156 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 1215 

QY 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 84 0 

I I I I I I I I M I I I I M I M I I I I I I I II I I I I I II I I | | | | | | | | | I I M I I I I 

Db 1216 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1275 

Qy 841 C CAAAT G GAT T CT T C GGAC AGAGAT GT T T GGAGAAAC TGCCTTTGC GAT T GT AC AT GC CA 900 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I | I 
Db 1276 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1335 

Qy 901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960 

I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I || 
Db 1336 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 1395 

Qy 961 T CAACT T CT C CAAG C AC CT T GGAT T T GAAT TAAA 994 

I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I 
Db 1396 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 142 9 
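
Each result in this report carries the same three statistics lines ("Query Match ...", "Best Local Similarity ...", "Matches ..."), written as semicolon-separated "name value" pairs. As a minimal sketch of how those lines could be read programmatically, the hypothetical helper below (`parse_stats` is not part of any search tool, only an illustration) splits on semicolons and converts each value to a number. It assumes the lines are free of OCR damage.

```python
import re

# Sketch only: parse the semicolon-separated statistics lines of one result.
# parse_stats is a hypothetical helper, not part of the search software.


def parse_stats(lines):
    """Map each field name (e.g. 'Score') to its numeric value."""
    stats = {}
    for line in lines:
        for field in line.split(";"):
            # Field name, whitespace, then a number with optional % or exponent.
            m = re.match(r"\s*(.+?)\s+([-+0-9.eE%]+)\s*$", field)
            if m:
                stats[m.group(1)] = float(m.group(2).rstrip("%"))
    return stats


# Statistics lines copied from RESULT 1 (AK124504) of this report.
result_1 = parse_stats([
    "Query Match 99.8%; Score 992.4; DB 9; Length 2613;",
    "Best Local Similarity 99.9%; Pred. No. 6.3e-187;",
    "Matches 993; Conservative 0; Mismatches 1; Indels 0; Gaps 0;",
])
print(result_1["Score"], result_1["Pred. No."], result_1["Mismatches"])
```

Percent signs are dropped, so "Query Match" is stored as 99.8 rather than 0.998.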
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CDS 



LOCUS AB005060 3020 bp mRNA linear PRI 14-NOV-1997
DEFINITION Homo sapiens mRNA for NTAK, complete cds.
ACCESSION AB005060
VERSION AB005060.1 GI:2626738
KEYWORDS NTAK.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (sites)
AUTHORS Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
and Ishiguro,H.
TITLE A novel brain-derived member of the epidermal growth factor family
that interacts with ErbB3 and ErbB4
JOURNAL J. Biochem. 122 (3), 675-680 (1997)
MEDLINE 98006324
PUBMED 9348101
REFERENCE 2 (bases 1 to 3020)
AUTHORS Ishiguro,H.
TITLE Direct Submission
JOURNAL Submitted (24-JUN-1997) Hiroshi Ishiguro, Fujita Health University,
ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
(E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
FEATURES Location/Qualifiers
source 1..3020
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/cell_line="SK-N-SH"
/cell_type="neuroblastoma"
CDS 226..2778
/codon_start=1
/product="NTAK"
/protein_id="BAA23417.1"
/db_xref="GI:2626739"
/translation="MRQVCCSALPPPPLEKGRCSSYSDSSSSSSERSSSSSSSSSESG
SSSRSSSNNSSISRPAAPPEPRPQQQPQPRSPAARRAAARSRAAAAGGMRRDPAPGFS
MLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREPPASGRVAL
VKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFAPLDTNG
KNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKE
LNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTT
LSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP
DPKQKAEELYQKRVLTITGICVALLVVGIVCVVAYCKTKKQRKQMHNHLRQNMCPAHQ
NRSLANGPSHPRLDPEEIQMADYISKNVPATDHVIRRETETTFSGSHSCSPSHHCSTA
TPTSSHRHESHTWSLERSESLTSDSQSGIMLSSVGTSKCNSPACVEARARRAAAYNLE
ERRRATAPPYHDSVDSLRDSPHSERYVSALTTPARLSPVDFHYSLATQVPTFEITSPN
SAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGPGPGPGPGPGADMQRSYDSYYYPAA
GPGPRRGTCALGGSLGSLPASPFRIPEDDEYETTQECAPPPPPRPRARGASRRTSAGP
RRWRRSRLNGLAAQRARAARDSLSLSSGSGGGSASASDDDADDADGALAAESTPFLGL
RGAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRGPPPRAKQDSAPL"
polyA_site 3020
/note="39 A nucleotides"

ORIGIN 

Query Match 91.9%; Score 913.2; DB 9; Length 3020; 

Best Local Similarity 99.7%; Pred. No. 3.4e-171; 

Matches 915; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 502 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 561 

QY 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120 

M I II I I I I I I I I I I II I I I I II I I I I II I I I I II I II I i I I I I I I I I I I I I I I I I I I I I 
Db 562 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 621 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 18 0 

I I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | | | 
Db 622 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 681 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 
Db 682 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 741 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

M I II I I I I I I I I I I I I I I I I I I I I I I I I | M I I I I II I I I II I I I I II I I I I I I I I I I I 
Db 742 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 801 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I I I I II I I I I I I | | | | | I I I | 
Db 8 02 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 861 

Qy 3 61 C C C C T C GAT AC CAAC GGC AAAAAT C T CAAGAAAGAG GT G G G CAAGAT C C T GT GC AC T GAC 420 

M I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 1 I I I I I 
Db 8 62 CCCCTCGATAC CAAC GGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGC ACT GAC 921 



Qy 



421 T G C G C C AC C C G GC C CAAGT T GAAGAAGAT GAAGAG C CAGAC GG GAC AG GT G G GT GAGAAG 480 
I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I M I I I 



Db 922 T GC GC CAC C C G GC C CAAGT T GAAGAAGAT GAAGAG C CAGAC G GGAC AG GT GG GT GAGAAG 981 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 54 0 

I I I M I I M I I I I I I I I I I I I I I M I I I M I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 982 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 1041 

Qy 541 GAT G G C AAG GAG CT CAAC C GC AG C C GAGAC AT T C G CAT CAAAT AT GGCAAC GGCAGAAAG 600 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I | | | | | | | | | | | | | | M | | | | | | | | 
Db 1042 GAT GG CAAG GAGC T CAAC C G C AGC C GAGAC AT T C G CAT CAAAT AT G GCAAC GG C AGAAAG 1101 

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660 

I I I I I I M I M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I II I I I I I I I I I I 
Db 1102 AAC T CAC GAC T ACAGT T C AACAAG GT GAAG GT G GAG GAC GCT G G G GAG TAT GT C T GC GAG 1161 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 72 0 

M I II I I I I I I I I I I I I I I I I I II I I I I I I I I I | | | | | | | | | | | | | | || | | | | | | M | M 
Db 1162 G C C GAGAACAT C CT G GG GAAG GAC AC CGTCCGGGGCCGGCTT T AC GT CAAC AG C GT GAG C 1221 

Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 78 0 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I 
Db 1222 ACCACCCTGT CAT CCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGC CAAGT CCT AT 12 81 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 84 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I II I 
Db 1282 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1341 

Qy 841 C CAAAT GGATT CTT C GGACAGAGAT GTTT GGAGAAACT GC CT TT GCGAT T GTACATGCCA 900 

I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 1342 C CAAAT G GATT CT T C GGAC AGAGAT GT T T G GAGAAACT GCCTTTGC GAT T GT AC AT G C C A 14 01 

Qy 901 GAT C CT AAGCAAAGT GT C 918 

I I I I I I I I I I I I I II 
Db 1402 GAT C C T AAG C AAAAAG C C 1419 
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ORIGIN 



LOCUS AR098145 1884 bp DNA linear PAT 14-FEB-2001
DEFINITION Sequence 5 from patent US 6074841.
ACCESSION AR098145
VERSION AR098145.1 GI:12807402
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1884)
AUTHORS Gearing,D.P. and Busfield,S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6074841-A 5 13-JUN-2000;
FEATURES Location/Qualifiers
source 1..1884
/organism="unknown"
/mol_type="unassigned DNA"



Query Match 90.5%; Score 900; DB 6; Length 1884; 

Best Local Similarity 99.3%; Pred. No. 1.4e-168; 



Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I M M I I I M I I I I I I I M I I I I I I I I I M I I I I I ! I I I I I I I I I I I I I I I I I I I I 
Db 218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277 

Qy 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 12 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I II I 

Db 278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180 

I I I M I I II II I I I I I I I II I I II I I I I I | | | | | || | | | | | | | | | | | | | | | | | | | | | | | | 
Db 338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I 

Db 398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I 
Db 458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I | | | 
Db 518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577 

Qy 361 C C C CT C GAT AC CAACG G CAAAAAT C T CAAGAAAGAGGT GGGCAAGAT C CT GT G C ACT GAC 420 

I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I M I I I I I I I I II I 1 I I I I I M I I I I I 

Db 57 8 C C C C T - GAT AC CAAC G G CAAAAAT CT CAAGAAAGAG GT GGG CAAGAT C C T GT GC ACT G GC 636 

Qy 421 T G C GC C AC C C G G C C CAAGT T GAAGAAGAT GAAGAG C CAGAC GGGAC AGGT GGGT GAGAAG 480 

I I I I I I I M I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I || | | 
Db 637 T G C GC C AC C C GGC C CAAGT T GAAGAAGAT GAAGAGC CAGAC GG G ACAGGT GGGT GAGAAG 696 

Qy 4 81 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 54 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 
Db 697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756 

Qy 541 GAT G G C AAGGAG CT CAAC C G C AGC C GAGAC AT T C G CAT CAAAT AT GGCAAC GG C AGAAAG 600 

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I II I II I I I I I I I II I I II I I I I I I I I 
Db 757 GAT GGCAAG GAG CT CAAC C GC AG C C GAGAC AT T C GC AT CAAAT AT GG CAAC G GCAGAAAG 816 

Qy 601 AACT CAC GAC T AC AGT T CAACAAG GT GAAGGT G GAG GAC GCT G G G GAGT AT GT C T G C GAG 660 

I I I I I I I I I I I I I M I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 817 AAC T CAC GAC T AC AGT T C AACAAGGT GAAGGT G GAGGAC GCT G GG GAGT AT GT CT G C GAG 876 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 72 0 

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I 
Db 877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936 

Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 78 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I | I | | 
Db 937 AC CAC C C T GT CAT C C T GGT C GG GGC AC G C C C GGAAGT G CAAC GAGAC AG C CAAGT C CT AT 996 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 84 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I | | | M I I I I I I I I I I I I I I 
Db 997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056 



Qy 841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900 

I I I I M I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M | | | | I I I I I I | I I I 
Db 1057 C CAAAT G GAT T CT T C GGAC AGAGAT GT T T GGAGAAACT GCCTTTGC GAT T GT AC AT GC C A 1116 

Qy 901 GATCCTAAGCAAAGTGTCCT 92 0 

I I I I I I I I I I I I I III 
Db 1117 GAT C CTAAGCAAAAG C AC CT 1136 
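
The scores in this report appear to decompose into per-column contributions. The sketch below is a hedged consistency check, not the search engine's actual scoring code. It takes Gapop 10.0 and Gapext 1.0 from the search header, and it assumes a match scores +1 and a mismatch scores -0.6. The IDENTITY_NUC table itself is not printed in this report, so those two values are inferred from the reported numbers. It further assumes that each gap costs Gapop once plus Gapext per gapped position.

```python
# Sketch: recompute a reported score from the match/mismatch/gap counts.
# MATCH and MISMATCH are assumptions inferred from the reported numbers;
# GAP_OPEN and GAP_EXTEND are the Gapop/Gapext values from the search header.
MATCH = 1.0
MISMATCH = -0.6
GAP_OPEN = 10.0
GAP_EXTEND = 1.0


def expected_score(matches: int, mismatches: int, indels: int, gaps: int) -> float:
    """Score under the assumed scheme, rounded to one decimal like the report."""
    gap_cost = gaps * GAP_OPEN + indels * GAP_EXTEND
    return round(matches * MATCH + mismatches * MISMATCH - gap_cost, 1)


# Counts copied from the statistics lines of three results in this report.
print(expected_score(914, 5, 1, 1))   # AR098145: reported Score 900
print(expected_score(993, 1, 0, 0))   # AK124504: reported Score 992.4
print(expected_score(923, 70, 0, 0))  # AR072053: reported Score 881
```

All three recomputed values agree with the reported scores under these assumptions, which supports the inferred mismatch penalty but does not prove it.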



RESULT 4 
AR116617 

LOCUS AR116617 1884 bp DNA linear PAT 16-MAY-2001
DEFINITION Sequence 5 from patent US 6133423.
ACCESSION AR116617
VERSION AR116617.1 GI:14096939
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1884)
AUTHORS Gearing,D.P. and Busfield,S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6133423-A 5 17-OCT-2000;
FEATURES Location/Qualifiers
source 1..1884
/organism="unknown"
/mol_type="unassigned DNA"

ORIGIN 

Query Match 90.5%; Score 900; DB 6; Length 1884; 

Best Local Similarity 99.3%; Pred. No. 1.4e-168; 

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I M I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I M II I I I I I I I I I I I I I I I I ! I 
Db 218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277 

Qy 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120 

I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I II I I I I 
Db 27 8 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 18 0 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | M | | | | | | | | | | | | | | | | | | | 
Db 338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 24 0 

I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I 
Db 39 8 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I II M I I I I I I I I I || I I I I | | | | | | | | | | | | M I I I I I I II I I I I I I I I I I II 
Db 458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I M I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I II I I I I I I || I I I | 
Db 518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577 



Qy 361 C C C C T C GAT AC CAAC G G C AAAAAT CT CAAGAAAGAG GT GGG CAAGAT C C T GT GC ACT GAC 42 0 

I I I I I I I I II II I I II I I I I I I I I I I I I I I I I I I I I M I I I I II I I I I II I I I I I I I I 

Db 57 8 C C C C T - GAT AC CAAC G G CAAAAAT C T CAAGAAAGAGGT GGG CAAGAT C C T GT GC ACT GGC 636 

Qy 421 T G C G C CAC C C G G C C CAAGT T GAAGAAGAT GAAGAG C C AGAC G G GAC AG GT GG GT GAGAAG 460 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I || I | | | | | | | | | | | | | | | | | | M | | || | | 
Db 637 T GC GC CAC C C G G C C C AAGT T GAAGAAGAT GAAGAGC C AGAC G G GACAG GT GG GT GAGAAG 696 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756 

Qy 541 GAT GGCAAG GAGCT CAAC C GCAG C C GAGAC ATT C GCAT CAAAT AT G G CAAC GG C AGAAAG 600 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I || 
Db 7 57 GAT G GCAAG GAG CT CAAC C GCAG C C GAGAC ATT C GCAT CAAAT AT G G CAAC GGC AGAAAG 816 

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660 

I I I I I I I I I I I I I II I I I || | | | | | | | | || | | | | | || | | | || || | | | | M | | | 

Db 817 AACT CAC GAC TAC AGT T CAAC AAGGT GAAG GT GGAGGAC G C T G G G GAGT AT GT C T GC GAG 876 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 72 0 

I I I I I I I I I I M I I I I I I I I I I II I I I II I I I I I I I | | | | | | | || | || | || | | | | | | | M 
Db 877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936 

Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780 

I I I M I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 937 AC CAC C C T GT CAT C CT G GT C G GGGC AC G C C C GGAAGT GCAAC GAGAC AG C CAAGT C C TAT 996 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGC7VAATGT 840 

I M I I I I I I I I I I I I I I I I I 1 I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056 

Qy 841 C CAAAT G GAT T CTT C GGACAGAGAT GT T T GGAGAAACT GCCTTTGC GAT T GT AC AT G C C A 900 

M I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | 
Db 1057 C CAAAT GGAT T C TT C GGACAGAGAT GT T T G GAGAAAC TGCCTTTGC GAT T GT AC AT GC C A 1116 

Qy 901 GATCCTAAGCAAAGTGTCCT 920 

I I I I I I I I I I II I III 
Db 1117 GAT C CT AAGCAAAAGC AC CT 113 6 



RESULT 5 
AR072053 
LOCUS AR072053 993 bp DNA linear PAT 18-FEB-2000
DEFINITION Sequence from patent US 5912326.
ACCESSION AR072053
VERSION AR072053. GI:7222941
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 993)
AUTHORS Chang,H.
TITLE Cerebellum-derived growth factors
JOURNAL Patent: US 5912326-A 3 15-JUN-1999;
FEATURES Location/Qualifiers
source 1..993
/organism="unknown"
/mol_type="unassigned DNA"
ORIGIN



Query Match 88.6%; Score 881; DB 6; Length 993; 

Best Local Similarity 93.0%; Pred. No. 8.5e-165; 

Matches 923; Conservative 0; Mismatches 70; Indels 0; Gaps 0; 



Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db   121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db   181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db   301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360


Qy 


361 


C C CC T C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GCAC T GAC 
II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 I || | | | | | | | | | | | | | | | || | | | | | 
CCGGT CGACCCTAACGGCAAAAACAT CAAGAAAGAGGT GGGCAAGAT CCTGTGCACT GAC 


420 


Db 


361 


420 


Qy 


421 


T GC GC C AC C C G GC C CAAGT T GAAGAAGAT GAAG AGC CAGAC G GGAC AGGT G GGT GAGAAG 
M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I | Ml | | | | || | | I | I | | 
T GCGCAACC CGGC C CAAGCT GAAGAAGAT GAAGAGT CAGACAGGAGAGGT GGGC GAGAAG 


480 


Db 


421 


480 


Qv 


481 


CAAT C GCT GAAGT GT G AGGC AGC A GC CGCT A AT r r C C A C cr tt r r t a r r r^nm r rrw r t\ ar- 

II Mill 1 1 1 II 1 1 1 1 1 1 II II II II 1 I I I I I | | | | | | | || | | | | | | | | | 

CAGTCGCT CAAGT GTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 


C A C\ 

o4 U 


Db 


481 


540 


Qy 


541 


GAT G GC AAG GAGC T CAAC C GC AGC C GAGAC AT T C G CAT CAAAT AT GG CAAC G GC AGAAAG 

M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II II 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 1 M 

GAC GG C AAGGAG CT CAAC C GGAGT C GT GAC AT T C GCAT CAAGT AT GG CAAC GG C AGAAAG 


600 


Db 


541 


600 


Qy 


601 


AACT CAC GAC TACAGTTCAACAAGGT GAAG GTGGAG GAC GCTGGGGAGTATGTCTGC GAG 
1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I | | | | | M 1 1 1 1 II 1 || I | | | | | | | | | | | | Ml 
AAC T CAC G G CT ACAGT T CAACAAAGT GAAGGT G GAG GAC GC T G GAG AGT AC GT C T GT GAG 


660 


Db 


601 


660 


Qy 


661 


GC C GAGAAC AT C CT G GG GAAG GAC AC CGTCCGGGGCCGGCTT T AC GT CAAC AG C GT GAG C 
M 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 I I I II 1 1 1 1 1 1 1 1 1 I I I I I I | | | | | | | I I | 
GCT GAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGT GAGC 


720 


Db 


661 


720 



Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 7 80 

M M I I I I I I I II I I I I I I I I I I 1 II I I I I I II I II I I I I I I I I I I I I I I I I I I I I 
Db 721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 7 80 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 84 0 

M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I 
Db 781 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 84 0 

Qy 841 C CAAAT G GAT T CT T C GGACAGAGAT GT T T G GAGAAAC TGCCTTTGC GAT T GT AC AT GC C A 900 

I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I II I I I I II I I I I | | | | I I I I I | I I | 
Db 841 C CAAAC GGAT T C T T C G GAC AGAGAT GT T T GGAGAAACT GCCTTTGC GAT T GT AC AT GC C A 900 

Qy 901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960 

I I I M II I I I I I I I I I II I I I II I I I I I I I II I I I I I I I I I | I I | | | | | | M I I I I I I I I 
Db 901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960 

Qy 961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAA 993 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 T CAAC T T CT CC AAGC AC C T T GGAT T T GAAT TAA 993 



RESULT 6 

D89995
LOCUS       D89995       2947 bp    mRNA    linear   ROD 07-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha1, complete cds.
ACCESSION   D89995
VERSION     D89995.1  GI:2605629
KEYWORDS    NTAK alpha1.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 2947)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-DEC-1996) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
COMMENT     Sequence updated (28-Feb-1997) by: Hiroshi Ishiguro.
FEATURES             Location/Qualifiers
     source          1..2947
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             79..2685
                     /codon_start=1
                     /product="NTAK alpha1"
                     /protein_id="BAA23344.1"
                     /db_xref="GI:2605630"
                     /translation="MRQVCCSALPPPLEKARCSSYSYSDSSSSSSSNNSSSSTSSRSS
                     SRSSSRSSRGSTTTTSSSENSGSNSGSIFRPAAPPEPRPQPQPQPRSPAARRAAARSR
                     AAAAGGMRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGS
                     SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTE
                     QPLVFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAA
                     AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILG
                     KDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF
                     FGQRCLEKLPLRLYMPDPKQKHLGFELKEAEELYQKRVLTITGICVALLWGIVCWA
                     YCKTKKQRRQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVPATDHV
                     IRREAETTFSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGIMLSSV
                     GTSKCNSPACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSALTTPA
                     RLSPVDFHYSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGP
                     GPGADMQRSYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQECAPP
                     PPPRPRTRGASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSASASDDD
                     ADDADGALAAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRG
                     PPTRAKQDSGPL"
ORIGIN

Query Match 80.5%; Score 800; DB 10; Length 2947;
Best Local Similarity 91.8%; Pred. No. 1e-148;
Matches 845; Conservative 0; Mismatches 75; Indels 0; Gaps 0;

Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db   403 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 462

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   463 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 522

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db   523 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 582

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db   583 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 642

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db   643 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 702

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db   703 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 762

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db   763 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 822

Qy   421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
         ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db   823 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 882

Qy   481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
         || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db   883 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 942

Qy   541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
         || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db   943 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1002

Qy   601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
         |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db  1003 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1062

Qy   661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
         || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db  1063 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1122

Qy   721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
         ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db  1123 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1182

Qy   781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
         || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db  1183 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1242

Qy   841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
         ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1243 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1302

Qy   901 GATCCTAAGCAAAGTGTCCT 920
         |||||||||||||    |||
Db  1303 GATCCTAAGCAAAAGCACCT 1322



RESULT 7 

E16456
LOCUS       E16456       3076 bp    DNA     linear   PAT 28-JUL-1999
DEFINITION  Rat mRNA for neuregulin-like Transmembrane Activator for ErbB
            Kinases (NTAK).
ACCESSION   E16456
VERSION     E16456.1  GI:5711139
KEYWORDS    JP 1998179166-A/1.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (bases 1 to 3076)
  AUTHORS   Higashiyama,S., Taniguchi,N., Ishiguro,K. and Nagatsu,T.
  TITLE     GENE ENCODING RECEPTOR TYPE TYROSINE-KINASE ERB B LIGAND AND ITS
  JOURNAL   Patent: JP 1998179166-A 1 07-JUL-1998;
            HIGASHIYAMA SHIGEKI
COMMENT     OS   Rattus sp. (rat)
            PN   JP 1998179166-A/1
            PD   07-JUL-1998
            PF   25-DEC-1996 JP 1996356998
            PI   HIGASHIYAMA SHIGEKI, TANIGUCHI NAOYUKI, ISHIGURO KEIJI,
            PI   NAGATSU TOSHIHARU
            PC   C12N15/09,C07K14/705,C07K16/28,C12N5/10,C12N15/02,C12P21/02,
            PC   C12P21/08,
            PC   C12Q1/68,G01N33/53,G01N33/566//A61K31/70,A61K38/46,A61K39/395,
            PC   A61K48/00,
            PC   C07H21/04,(C12N5/10,C12R1:91),(C12P21/02,C12R1:91);
            CC   strandedness: Double;
            CC   topology: Linear;
            FH   Key             Location/Qualifiers
            FH
            FT   source          1..3076
            FT                   /organism='Rattus sp.'
            FT                   /cell_line='PC12'
            FT   CDS             232..2814
            FT                   /product='NTAK protein'
FEATURES             Location/Qualifiers
     source          1..3076
                     /organism="Rattus sp."
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10118"
ORIGIN



Query Match 80.4%; Score 799.4; DB 6; Length 3076;
Best Local Similarity 92.2%; Pred. No. 1.3e-148;
Matches 842; Conservative 0; Mismatches 71; Indels 0; Gaps 0;

Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db   556 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 615

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   616 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 675

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db   676 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 735

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db   736 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 795

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db   796 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 855

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db   856 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 915

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db   916 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 975

Qy   421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
         ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db   976 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1035

Qy   481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
         || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db  1036 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1095

Qy   541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
         || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db  1096 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1155

Qy   601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
         |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db  1156 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1215

Qy   661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
         || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db  1216 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1275

Qy   721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
         ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db  1276 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1335

Qy   781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
         || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db  1336 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1395

Qy   841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
         ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1396 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1455

Qy   901 GATCCTAAGCAAA 913
         |||||||||||||
Db  1456 GATCCTAAGCAAA 1468



RESULT 8 

D89996
LOCUS       D89996       3077 bp    mRNA    linear   ROD 07-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha2, complete cds.
ACCESSION   D89996
VERSION     D89996.1  GI:2605631
KEYWORDS    NTAK alpha2.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 3077)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-DEC-1996) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
FEATURES             Location/Qualifiers
     source          1..3077
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             233..2815
                     /codon_start=1
                     /product="NTAK alpha2"
                     /protein_id="BAA23345.1"
                     /db_xref="GI:2605632"
                     /translation="MRQVCCSALPPPLEKARCSSYSYSDSSSSSSSNNSSSSTSSRSS
                     SRSSSRSSRGSTTTTSSSENSGSNSGSIFRPAAPPEPRPQPQPQPRSPAARRAAARSR
                     AAAAGGMRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGS
                     SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTE
                     QPLVFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAA
                     AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILG
                     KDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF
                     FGQRCLEKLPLRLYMPDPKQKAEELYQKRVLTITGICVALLWGIVCWAYCKTKKQR
                     RQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVPATDHVIRREAETT
                     FSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGIMLSSVGTSKCNSP
                     ACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSALTTPARLSPVDFH
                     YSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGPGPGADMQR
                     SYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQECAPPPPPRPRTR
                     GASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSASASDDDADDADGAL
                     AAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRGPPTRAKQD
                     SGPL"
ORIGIN



Query Match 80.4%; Score 799.4; DB 10; Length 3077;
Best Local Similarity 92.2%; Pred. No. 1.3e-148;
Matches 842; Conservative 0; Mismatches 71; Indels 0; Gaps 0;

Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db   557 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 616

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   617 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 676

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db   677 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 736

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db   737 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 796

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db   797 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 856

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db   857 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 916

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db   917 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 976

Qy   421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
         ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db   977 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1036

Qy   481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
         || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db  1037 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1096

Qy   541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
         || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db  1097 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1156

Qy   601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
         |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db  1157 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1216

Qy   661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
         || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db  1217 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1276

Qy   721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
         ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db  1277 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1336

Qy   781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
         || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db  1337 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1396

Qy   841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
         ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1397 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1456

Qy   901 GATCCTAAGCAAA 913
         |||||||||||||
Db  1457 GATCCTAAGCAAA 1469



RESULT 9 
AR072052
LOCUS       AR072052     3441 bp    DNA     linear   PAT 18-FEB-2000
DEFINITION  Sequence 1 from patent US 5912326.
ACCESSION   AR072052
VERSION     AR072052.1  GI:7222940
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 3441)
  AUTHORS   Chang,H.
  TITLE     Cerebellum-derived growth factors
  JOURNAL   Patent: US 5912326-A 1 15-JUN-1999;
FEATURES             Location/Qualifiers
     source          1..3441
                     /organism="unknown"
                     /mol_type="unassigned DNA"
ORIGIN

Query Match 74.3%; Score 738.6; DB 6; Length 3441;
Best Local Similarity 90.4%; Pred. No. 1.6e-136;
Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0;

Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db   180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db   300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db   360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db   420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db   480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db   540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy   421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
         ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db   600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy   481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
         || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db   660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy   541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
         || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db   720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy   601                                                              660
Db   780                                                              839

Qy   661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
         || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db   840 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 899

Qy   721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
         ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db   900 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 959

Qy   781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
         || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db   960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019

Qy   841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
         ||    |||| |  |||  | || |||  | ||
Db  1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 1052



RESULT 10 

AR098144
LOCUS       AR098144     1607 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 3 from patent US 6074841.
ACCESSION   AR098144
VERSION     AR098144.1  GI:12807401
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1607)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 3 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1607
                     /organism="unknown"
                     /mol_type="unassigned DNA"
ORIGIN



Query Match 55.1%; Score 547.2; DB 6; Length 1607;
Best Local Similarity 92.3%; Pred. No. 1.6e-98;
Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0;

Qy   371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
         | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db     2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy   431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
         |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db    62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy   491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
         ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db   122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy   551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
         | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db   182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy   611                                                              670
Db   242                                                              301

Qy   671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
         |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db   302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy   731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
         ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db   362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy   791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
         ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db   422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy   851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy   911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy   971 CAAGCACCTTGGATTTGAATTAAA 994
         ||||||||||||||||||||| ||
Db   602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 11 

AR116616
LOCUS       AR116616     1607 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence 3 from patent US 6133423.
ACCESSION   AR116616
VERSION     AR116616.1  GI:14096938
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1607)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 3 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..1607
                     /organism="unknown"
                     /mol_type="unassigned DNA"
ORIGIN



Query Match 55.1%; Score 547.2; DB 6; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 1.6e-98; 

Matches 57 6; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 

Qy        371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
              | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db          2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy        431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
              |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db         62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy        491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
              ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db        122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy        551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
              | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db        182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy        611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
              ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db        242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy        671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
              |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db        302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy        731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
              ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db        362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy        791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
              ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db        422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy        851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy        911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy        971 CAAGCACCTTGGATTTGAATTAAA 994
              ||||||||||||||||||||| ||
Db        602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 12 

AR098146 

LOCUS AR098146 1476 bp DNA linear PAT 14-FEB-2001
DEFINITION Sequence 7 from patent US 6074841.
ACCESSION AR098146
VERSION AR098146.1 GI:12807403
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1476)
AUTHORS Gearing, D.P. and Busfield, S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6074841-A 7 13-JUN-2000;
FEATURES Location/Qualifiers
source 1..1476
/organism="unknown"
/mol_type="unassigned DNA"
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 1.5e-87; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy        363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
              || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db         98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy        423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy        483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy        543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy        603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy        663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy        723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy        783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy        843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy        903 TCCTAAGCAAAGTGTC 918
              |||||||||||  | |
Db        638 TCCTAAGCAAAAAGCC 653



RESULT 13 
AR116618 

LOCUS AR116618 1476 bp DNA linear PAT 16-MAY-2001
DEFINITION Sequence 7 from patent US 6133423.
ACCESSION AR116618
VERSION AR116618.1 GI:14096940
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1476)
AUTHORS Gearing, D.P. and Busfield, S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6133423-A 7 17-OCT-2000;
FEATURES Location/Qualifiers
source 1..1476
/organism="unknown"
/mol_type="unassigned DNA"
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 1.5e-87; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy        363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
              || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db         98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy        423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy        483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy        543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy        603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy        663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy        723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy        783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy        843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy        903 TCCTAAGCAAAGTGTC 918
              |||||||||||  | |
Db        638 TCCTAAGCAAAAAGCC 653



RESULT 14 
AR098155 

LOCUS AR098155 2268 bp DNA linear PAT 14-FEB-2001
DEFINITION Sequence 31 from patent US 6074841.
ACCESSION AR098155
VERSION AR098155.1 GI:12807412
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 2268)
AUTHORS Gearing, D.P. and Busfield, S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6074841-A 31 13-JUN-2000;
FEATURES Location/Qualifiers
source 1..2268
/organism="unknown"
/mol_type="unassigned DNA"
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 1.5e-87; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy        363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
              || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db         98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy        423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy        483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy        543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy        603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy        663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy        723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy        783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy        843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy        903 TCCTAAGCAAAGTGTC 918
              |||||||||||  | |
Db        638 TCCTAAGCAAAAAGCC 653



RESULT 15 
AR116627 

LOCUS AR116627 2268 bp DNA linear PAT 16-MAY-2001
DEFINITION Sequence 31 from patent US 6133423.
ACCESSION AR116627
VERSION AR116627.1 GI:14096949
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 2268)
AUTHORS Gearing, D.P. and Busfield, S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6133423-A 31 17-OCT-2000;
FEATURES Location/Qualifiers
source 1..2268
/organism="unknown"
/mol_type="unassigned DNA"
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 1.5e-87; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy        363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
              || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db         98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy        423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy        483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy        543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy        603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy        663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy        723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy        783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy        843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy        903 TCCTAAGCAAAGTGTC 918
              |||||||||||  | |
Db        638 TCCTAAGCAAAAAGCC 653
Job time : 4249.5 secs
ALIGNMENTS



RESULT 1 
AAS18019 

ID AAS18019 standard; cDNA; 994 BP. 
XX 

AC AAS18019; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human cDNA encoding Neuregulin-2alpha, NRG-2alpha. 
XX 

KW Human; ss; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis;

KW cell survival; cell growth; cell differentiation; erbB receptor; 

KW cardiomyopathy; ischaemic damage; cardiac trauma; heart failure; 

KW atherosclerosis; vascular lesion; vascular hypertension; 



KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia;

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..993

FT /*tag= a

FT /product= "NRG-2alpha"

XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001. 
XX 

PF 23-MAY-2001; 2001WO-US016896.
XX

PR 23-MAY-2000; 2000US-0206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR P-PSDB; AAU11635. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating multiple 

PT sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's disease, by 

PT increasing mitogenesis, survival, growth or differentiation of a cell. 
XX 

PS Claim 57; Fig 6; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human NRG-2alpha 

CC or NRG-2beta (clone 2b7) and the polynucleotides encoding them. Also

CC included are a vector expressing the protein, a host cell comprising the 

CC vector, a transgenic non-human animal transformed with the vector or 

CC having a knockout mutation in one or both NRG-2 alleles and an anti-NRG-2 

CC antibody. Analysis of mutations in NRG-2 in an individual is useful for 

CC diagnosing an increased likelihood of developing a NRG-2-related disease 

CC or condition in a test subject. NRG-2 is useful for increasing the 

CC mitogenesis, survival, growth or differentiation of a cell (e.g. a 

CC neuronal cell), where the cell expresses an erbB receptor. NRG-2 is 

CC useful for treating diseases and disorders such as cardiomyopathy 

CC (preferably degenerative congenital disease), ischaemic damage, cardiac 

CC trauma or heart failure or which has a condition affecting smooth muscle 

CC which include atherosclerosis, vascular lesion, vascular hypertension, 

CC and degenerative congenital vascular disease, myasthenia gravis, a 

CC neurodegenerative disorder, peripheral neuropathy, a sensory nerve fiber 

CC neuropathy, a motor fiber and a sensory nerve fiber neuropathy, multiple 

CC sclerosis, amyotrophic lateral sclerosis, spinal muscular atrophy, nerve 

CC injury, Alzheimer's disease, Parkinson's disease, cerebellar ataxia, and

CC spinal cord injury. The antibody is useful for treatment of a tumour 



CC comprising inhibiting proliferation of a tumour cell preferably a glial 

CC tumour cell, for treating of neurofibromatosis by inhibiting glial cell 

CC mitogenesis. The present sequence encodes NRG-2alpha 
XX 

SQ Sequence 994 BP; 230 A; 279 C; 304 G; 181 T; 0 U; 0 Other; 



Query Match 100.0%; Score 994; DB 6; Length 994; 

Best Local Similarity 100.0%; Pred. No. 4.8e-230; 

Matches 994; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy        901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy        961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
              ||||||||||||||||||||||||||||||||||
Db        961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994



RESULT 2 
AAV17814 

ID AAV17814 standard; cDNA; 1884 BP. 
XX 

AC AAV17814; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 
KW wound healing; transmembrane; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT CDS 664..1884

FT /*tag= a 

FT /note= "don-1 polypeptide" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585.
XX

PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX 

PA (MILL- ) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Gearing DP, Busfield SJ; 
XX 



DR WPI; 1998-169084/15. 

DR P-PSDB; AAW48381. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas

PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 4; Fig 3; 121pp; English. 
XX 

CC The sequence is that of a human don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 

XX 

SQ Sequence 1884 BP; 426 A; 607 C; 560 G; 291 T; 0 U; 0 Other; 



Query Match 90.5%; Score 900; DB 2; Length 1884;

Best Local Similarity 99.3%; Pred. No. 2.8e-207; 

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy        901 GATCCTAAGCAAAGTGTCCT 920
              |||||||||||||    |||
Db       1117 GATCCTAAGCAAAAGCACCT 1136




RESULT 3 
AAT87923 

ID AAT87923 standard; cDNA; 1803 BP. 
XX 

AC AAT87923; 
XX 

DT 18-DEC-1997 (first entry) 
XX 

DE Rat cerebellum derived growth factor 2 cDNA. 
XX 

KW Rat; cerebellum derived growth factor; CDGF2; screening; binding; 

KW modulation; erbB type receptor; identification; indication; risk; 

KW proliferation; differentiation; induction; neuron; hyperplasia; 

KW stem cell culture; intracerebral graft; alleviation; repair; 

KW behavioural defect; nervous system; central; peripheral; nerve; 

KW prothesis; damage; entubulation; cell survival; treatment; injury;

KW trauma; ischaemia; ischemia; stroke; infection; disorder; inflammation; 

KW neurodegeneration; disease; Parkinson's; Huntingdon's; 

KW amylotrophic lateral sclerosis; sensory; retina; 

KW spinocerebellar degeneration; multiple sclerosis; neoplasia; 

KW amalignant glioma; medulloblastoma; neuroectodermal tumour; ds.

XX 

OS Rattus rattus. 
XX 

FH Key Location/Qualifiers 



FT CDS 1..993

FT /*tag= a

FT sig_peptide 1..69

FT /*tag= b

FT mat_peptide 70..990

FT /*tag= c

FT /product= "cerebellum_derived_growth factor"

XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997.
XX

PF 09-SEP-1996; 96WO-US014484.
XX

PR 08-SEP-1995; 95US-00525864.
XX 

PA (HARD ) HARVARD COLLEGE. 

PA (STRD ) UNIV LELAND S STANFORD. 

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR P-PSDB; AAW27537. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the treatment 

PT of neuronal injury and proliferative disorders. 

XX 

PS Claim 17; Page 70-74; 94pp; English. 
XX 

CC The present sequence encodes rat cerebellum derived growth factor 2 

CC (CDGF2), which can be used to screen for modulators of CDGF binding to 

CC erbB type receptors. Identification of a modification or mutation in a 

CC CDGF gene, or aberrant expression of a CDGF gene or levels of soluble 

CC CDGF may be used to indicate the risk of unwanted cell proliferation or 

CC differentiation. CDGF may be used to induce neuronal differentiation in 

CC stem cell culture, and maintain the integrity of a terminally 

CC differentiated neuronal cell culture, e.g. useful for intracerebral 

CC grafting to alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially where 

CC a crushed or severed axon is entubulated by a prosthetic. CDGF may also 

CC be used to enhance neuronal cell survival in the central or peripheral 

CC nervous system, to treat neurological conditions associated with nervous 

CC system injury, e.g. traumatic, chemical or vasal injury and deficits such 

CC as ischaemia resulting from stroke, infectious/inflammatory and tumour 

CC induced injury, chronic neurodegenerative disease including Parkinson's 

CC and Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the central 

CC nervous system, e.g. amalignant gliomas, medulloblastomas and

CC neuroectodermal tumours 

XX 

SQ Sequence 1803 BP; 408 A; 549 C; 537 G; 309 T; 0 U; 0 Other; 



Query Match 88.7%; Score 882; DB 2; Length 1803; 

Best Local Similarity 93.0%; Pred. No. 6.2e-203; 



Matches 924; Conservative 0; Mismatches 70; Indels 0; Gaps 0;



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db    121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    361 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db    421 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db    481 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db    541 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 600

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db    601 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db    661 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db    721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 780

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db    781 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 840



Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy    901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy    961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
          ||||||||||||||||||||||||||||||||||
Db    961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994






RESULT 4 
AAS18020 

ID AAS18020 standard; cDNA; 897 BP. 
XX 

AC AAS18020; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human cDNA encoding Neuregulin-2beta, NRG-2beta. 
XX 

KW Human; ss; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis;

KW cell survival; cell growth; cell differentiation; erbB receptor; 

KW cardiomyopathy; ischaemic damage; cardiac trauma; heart failure; 

KW atherosclerosis; vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 
KW neurodegenerative disorder; peripheral neuropathy;
KW sensory nerve fiber neuropathy; motor fiber neuropathy;

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia;

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..897

FT /*tag= a 

FT /product= "NRG-2beta" 

XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001.
XX 

PF 23-MAY-2001; 2001WO-US016896.
XX 

PR 23-MAY-2000; 2000US-0206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR P-PSDB; AAU11636. 



XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating multiple 

PT sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's disease, by 

PT increasing mitogenesis, survival, growth or differentiation of a cell. 
XX 

PS Claim 57; Fig 8; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human NRG-2alpha 

CC or NRG-2beta (clone 2b7) and the polynucleotides encoding them. Also

CC included are a vector expressing the protein, a host cell comprising the 

CC vector, a transgenic non-human animal transformed with the vector or 

CC having a knockout mutation in one or both NRG-2 alleles and an anti-NRG-2

CC antibody. Analysis of mutations in NRG-2 in an individual is useful for 

CC diagnosing an increased likelihood of developing a NRG-2-related disease 

CC or condition in a test subject. NRG-2 is useful for increasing the 

CC mitogenesis, survival, growth or differentiation of a cell (e.g. a 

CC neuronal cell), where the cell expresses an erbB receptor. NRG-2 is 

CC useful for treating diseases and disorders such as cardiomyopathy 

CC (preferably degenerative congenital disease), ischaemic damage, cardiac 

CC trauma or heart failure or which has a condition affecting smooth muscle

CC which include atherosclerosis, vascular lesion, vascular hypertension, 

CC and degenerative congenital vascular disease, myasthenia gravis, a 

CC neurodegenerative disorder, peripheral neuropathy, a sensory nerve fiber 

CC neuropathy, a motor fiber and a sensory nerve fiber neuropathy, multiple 

CC sclerosis, amyotrophic lateral sclerosis, spinal muscular atrophy, nerve 

CC injury, Alzheimer's disease, Parkinson's disease, cerebellar ataxia, and 

CC spinal cord injury. The antibody is useful for treatment of a tumour 

CC comprising inhibiting proliferation of a tumour cell preferably a glial 

CC tumour cell, for treating of neurofibromatosis by inhibiting glial cell 

CC mitogenesis. The present sequence encodes NRG-2beta 

XX 

SQ Sequence 897 BP; 200 A; 261 C; 282 G; 154 T; 0 U; 0 Other; 

Query Match 85.4%; Score 849; DB 6; Length 897; 

Best Local Similarity 98.3%; Pred. No. 4.7e-195; 

Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0; 



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
          ||    |||| |  |||  | || |||  | ||
Db    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873





RESULT 5 
AAV43674 

ID AAV43674 standard; cDNA; 3076 BP.
XX

AC AAV43674;
XX

DT 29-SEP-1998 (first entry)
XX

DE Receptor type tyrosine kinase ErbB ligand encoding cDNA.
XX

KW Receptor type tyrosine kinase ErbB; ligand; diagnostic agent;

KW nervous disease; cancer; ss.
XX

OS Rattus sp.
XX



FH Key Location/Qualifiers 

FT CDS 232..2814

FT /*tag= a 

FT /product= "ligand of receptor type tyrosine kinase ErbB"

XX 

PN JP10179166-A. 
XX 

PD 07-JUL-1998. 
XX 

PF 25-DEC-1996; 96JP-00356998.
XX 

PR 25-DEC-1996; 96JP-00356998.
XX 

PA (HIGA/) HIGASHIYAMA S. 
XX 

DR WPI; 1998-430952/37. 

DR P-PSDB; AAW63700. 
XX 

PT Gene coding the ligand of the tyrosine kinase ErbB receptor - useful for 

PT diagnosing and treating nervous diseases and cancer. 

XX 

PS Example; Page 9-13; 17pp; Japanese. 
XX 

CC This cDNA encodes the ligand of receptor type tyrosine kinase ErbB. A 

CC prokaryotic or eukaryotic host cell transformed by a recombinant vector 

CC containing the encoding DNA can be used for the recombinant production of 

CC the protein. The invention provides a method for inhibiting the formation 

CC of the ligand of receptor type tyrosine kinase ErbB in an animal using an 

CC antibody recognizing the protein. The ligand of the tyrosine kinase ErbB 

CC receptor and associated materials can be used for treating or diagnosing 

CC nervous diseases and cancers 
XX 

SQ Sequence 3076 BP; 673 A; 996 C; 944 G; 463 T; 0 U; 0 Other; 

Query Match 80.4%; Score 799.4; DB 2; Length 3076; 

Best Local Similarity 92.2%; Pred. No. 6.3e-183; 

Matches 842; Conservative 0; Mismatches 71; Indels 0; Gaps 0; 

Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db    556 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 615

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    616 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 675

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db    676 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 735

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    736 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 795

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    796 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 855

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    856 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 915

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    916 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 975

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db    976 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1035

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db   1036 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1095

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db   1096 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1155

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db   1156 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1215

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db   1216 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1275

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db   1276 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1335

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db   1336 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1395

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1396 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1455

Qy    901 GATCCTAAGCAAA 913
          |||||||||||||
Db   1456 GATCCTAAGCAAA 1468





RESULT 6 
ABS56035 

ID ABS56035 standard; cDNA; 1863 BP. 
XX 

AC ABS56035; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human membrane-bound splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 



KW glycoprotein ligand; cell proliferation; cell proliferative disorder;

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 

KW vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 643..1863

FT /*tag= a 

FT /partial 

FT /product= "Membrane-bound splice variant of Don-1"

FT /note= "This sequence lacks a stop codon" 

XX 

PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-00096241.
XX 

PR 22-JUN-2000; 2000US-00599789.
XX 

PA (GEAR/) GEARING D P.
PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71638. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells,

PT for identifying proteins that interact with Don-1, and for regulating

PT tumor formation and progression in brain. 
XX 

PS Claim 4; Fig 3; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called

CC Don-1, and alternate splice variants of Don-1, which are related to

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides

CC are glycoprotein ligands. Both murine and human Don-1 sequences are

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1

CC polypeptides are useful for treating and diagnosing cell proliferative

CC disorders and play a role in the proliferation of carcinomas e.g.

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and

CC survival. The polypeptides are also useful for inhibiting proliferation

CC of adenocarcinoma cells, for stimulating the proliferation of cells such

CC as epithelial cells to promote wound healing, for identifying proteins

CC that interact with Don-1, and for regulating tumour formation and

CC progression in the brain. The polynucleotide sequences encoding Don-1 may

CC be used in gene therapy. The present sequence encodes human

CC membrane-bound splice variant of Don-1

XX 

SQ Sequence 1863 BP; 422 A; 602 C; 553 G; 286 T; 0 U; 0 Other; 



Query Match 79.3%; Score 788; DB 7; Length 1863;

Best Local Similarity 97.6%; Pred. No. 3.1e-180;

Matches 898; Conservative 0; Mismatches 5; Indels 17; Gaps 9;

Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    213 ATGAGGCGCGACCCGGCCCCC--CTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 270

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    271 TACTCGCCCAGCCTCAAGTCA--GCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 328

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||||
Db    329 GGCAAGGTACAGGGGCTGGT--CAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 386

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    387 CCCGCCTCGGGTCGGGTGGCG--GGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 444

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    445 GGGCTGCAGCGCGAGCAGGTG--CAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 502

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    503 CGCTACATCTTTTTCCTGGAG--CACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 560

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||| ||||||||||||||||  |||||||||||||||||||||||||||||||||| |
Db    561 CCCCT-GATACCAACGGCAAAA--CTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 617

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||
Db    618 TGCGCCACCCGGCCCAAGTTGA--AAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 675

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    676 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 735

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    736 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 795

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    796 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 855

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    856 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 915

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    916 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 975

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    976 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1035

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1036 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1095

Qy    901 GATCCTAAGCAAAGTGTCCT 920
          |||||||||||||    |||
Db   1096 GATCCTAAGCAAAAGCACCT 1115



RESULT 7
AAT87922

ID AAT87922 standard; cDNA; 3441 BP.
XX

AC AAT87922;
XX

DT 18-DEC-1997 (first entry)
XX

DE Rat cerebellum derived growth factor 1 cDNA.
XX

KW Rat; cerebellum derived growth factor; CDGF1; screening; binding;
KW modulation; erbB type receptor; identification; indication; risk;
KW proliferation; differentiation; induction; neuron; hyperplasia;
KW stem cell culture; intracerebral graft; alleviation; repair;
KW behavioural defect; nervous system; central; peripheral; nerve;
KW prothesis; damage; entubulation; cell survival; treatment; injury;
KW trauma; ischaemia; ischemia; stroke; infection; disorder; inflammat.
KW neurodegeneration; disease; Parkinson's; Huntingdon's;
KW amylotrophic lateral sclerosis; sensory; retina;
KW spinocerebellar degeneration; multiple sclerosis; neoplasia;
KW amalignant glioma; medulloblastoma; neuroectodermal tumour; ds.
XX

OS Rattus rattus.
XX

FH Key Location/Qualifiers
FT CDS 180..2444
FT /*tag= a
FT sig_peptide 180..248
FT /*tag= b
FT mat_peptide 249..2441
FT /*tag= c
FT /product= "cerebellum_derived growth factor"
XX

PN WO9709425-A1.
XX

PD 13-MAR-1997.
XX

PF 09-SEP-1996; 96WO-US014484.
XX

PR 08-SEP-1995; 95US-00525864.
XX

PA (HARD ) HARVARD COLLEGE.
PA (STRD ) UNIV LELAND S STANFORD.
XX

PI Chang H;
XX







DR WPI; 1997-192900/17. 

DR P-PSDB; AAW27536. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the treatment 

PT of neuronal injury and proliferative disorders. 

XX 

PS Claim 17; Page 63-66; 94pp; English. 
XX 

CC The present sequence encodes rat cerebellum derived growth factor 1 

CC (CDGF1), which can be used to screen for modulators of CDGF binding to 

CC erbB type receptors. Identification of a modification or mutation in a 

CC CDGF gene, or aberrant expression of a CDGF gene or levels of soluble 

CC CDGF may be used to indicate the risk of unwanted cell proliferation or 

CC differentiation. CDGF may be used to induce neuronal differentiation in 

CC stem cell culture, and maintain the integrity of a terminally 

CC differentiated neuronal cell culture, e.g. useful for intracerebral 

CC grafting to alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially where 

CC a crushed or severed axon is entubulated by a prosthetic. CDGF may also 

CC be used to enhance neuronal cell survival in the central or peripheral 

CC nervous system, to treat neurological conditions associated with nervous 

CC system injury, e.g. traumatic, chemical or vasal injury and deficits such 

CC as ischaemia resulting from stroke, infectious/inflammatory and tumour 

CC induced injury, chronic neurodegenerative disease including Parkinson's

CC and Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the central 

CC nervous system, e.g. amalignant gliomas, medulloblastomas and 

CC neuroectodermal tumours 

XX 

SQ Sequence 3441 BP; 777 A; 1057 C; 1015 G; 592 T; 0 U; 0 Other; 



Query Match 74.3%; Score 738.6; DB 2; Length 3441; 

Best Local Similarity 90.4%; Pred. No. 3.1e-168; 

Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0; 



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db    180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db    300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db    600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db    660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db    720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db    780 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 839


Qy 


661 


GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 
N 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 MINI 
GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 


720 


Db 


840 


899 


Qy 

Db 


721 
900 


ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 I I | | | | | | || | | | | | || 
ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 


780 
959 


Qy 


781 


TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATCIT 

II II 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | | | | 

TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 


O H. \J 


Db 


960 


1019 


Qy 


841 


C CAAAT GGATT CTT CG GAC AGAGAT GT T T GGAG 873 

M 1 1 1 1 1 III 1 1 1 II 1 III 

C CT GT GGGAT ACAC C GGGGAC AG GT GT C AGC AG 1052 




Db 


1020 





RESULT 8 
AAV17813 

ID AAV17813 standard; cDNA; 1607 BP. 
XX 

AC AAV17813; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Mus musculus don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 

KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 

KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 

KW wound healing; secreted protein; ss. 



OS Mus musculus. 
XX 

FH Key Location/Qualifiers 

FT CDS 79..624 

FT /*tag= a 

FT /note= "secreted don-1 polypeptide" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585. 
XX 

PR 19-AUG-1996; 96US-00699591. 

PR 19-NOV-1996; 96US-00753007. 
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 1998-169084/15. 

DR P-PSDB; AAW48380. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas 

PT and adenocarcinoma(s), and for wound healing. 

XX 

PS Claim 4; Fig 2; 121pp; English. 
XX 

CC The sequence is that of a murine don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 

XX 

SQ Sequence 1607 BP; 365 A; 500 C; 480 G; 262 T; 0 U; 0 Other; 



Query Match 55.1%; Score 547.2; DB 2; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 4e-122; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 

Qy    371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
          | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db      2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy    431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
          |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db     62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy    491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
          ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db    122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy    551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
          | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db    182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy    611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
          ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db    242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy    671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
          |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db    302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy    731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
          ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db    362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy    791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
          ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db    422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy    851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy    911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy    971 CAAGCACCTTGGATTTGAATTAAA 994
          ||||||||||||||||||||| ||
Db    602 CAAGCACCTTGGATTTGAATTGAA 625

RESULT 9 
ABS56034 



ID ABS56034 standard; cDNA; 1561 BP. 
XX 

AC ABS56034; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding murine secreted splice variant of Don-1. 
XX 

KW Murine; Don-1; epidermal growth factor; EGF; neuregulin; mouse; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 

KW vulnerary; cytostatic; gene therapy; chromosome 18; gene; ss. 

XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 
FT CDS 78..623 

FT /*tag= a 

FT /product= "Secreted splice variant of Don-1" 

XX 

PN US2002127594-A1. 



PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-00096241. 
XX 

PR 22-JUN-2000; 2000US-00599789. 
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71637. 



XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 4; Fig 2; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence encodes murine secreted 

CC splice variant of Don-1 

XX 

SQ Sequence 1561 BP; 361 A; 479 C; 465 G; 256 T; 0 U; 0 Other; 



Query Match 53.8%; Score 535.2; DB 7; Length 1561; 

Best Local Similarity 92.1%; Pred. No. 3.2e-119; 

Matches 575; Conservative 0; Mismatches 48; Indels 1; Gaps 1; 

Qy    371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
          | |||||||||||  |||||||||||||||||||||||||||||||||||||||||| ||
Db      2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCA-CC 60

Qy    431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
          |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db     61 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 120

Qy    491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
          ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db    121 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 180

Qy    551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
          | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db    181 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 240

Qy    611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
          ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db    241 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 300

Qy    671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
          |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db    301 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 360

Qy    731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
          ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db    361 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 420

Qy    791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
          ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db    421 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 480

Qy    851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 540

Qy    911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 600

Qy    971 CAAGCACCTTGGATTTGAATTAAA 994
          ||||||||||||||||||||| ||
Db    601 CAAGCACCTTGGATTTGAATTGAA 624

RESULT 10 
AAV17816 



ID AAV17816 standard; cDNA; 2268 BP. 
XX 

AC AAV17816; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 
KW wound healing; transmembrane; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT CDS 69..2012 

FT /*tag= a 

FT /note= "don-1 polypeptide" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 



XX 

PF 18-AUG-1997; 97WO-US014585. 
XX 

PR 19-AUG-1996; 96US-00699591. 

PR 19-NOV-1996; 96US-00753007. 
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 1998-169084/15. 

DR P-PSDB; AAW48383. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas 

PT and adenocarcinoma(s), and for wound healing. 

XX 

PS Claim 4; Fig 7; 121pp; English. 
XX 

CC The sequence is that of a human don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 

XX 

SQ Sequence 2268 BP; 502 A; 735 C; 700 G; 331 T; 0 U; 0 Other; 

Query Match 49.5%; Score 492; DB 2; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 9.4e-109; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db     98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy    843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy    903 TCCTAAGCAAAGTGTC 918
          |||||||||||  | |
Db    638 TCCTAAGCAAAAAGCC 653





RESULT 11 
ABS56036 

ID ABS56036 standard; cDNA; 1474 BP. 
XX 

AC ABS56036; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human second splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 

KW vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT CDS 68..1473 

FT /*tag= a 

FT /partial 

FT /product= "Second splice variant of Don-1" 

FT /note= "This sequence lacks a stop codon" 

FT /transl_except= (pos:107..108, aa:Lys) 

FT /note= "This codon has an apparent 1 nucleotide deletion 

FT which alters the reading frame" 

XX 



PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-00096241. 
XX 

PR 22-JUN-2000; 2000US-00599789. 
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 



PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71639. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 4; Fig 4; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence encodes human second splice 

CC variant of Don-1 

XX 

SQ Sequence 1474 BP; 335 A; 472 C; 451 G; 216 T; 0 U; 0 Other; 

Query Match 49.4%; Score 491; DB 7; Length 1474; 

Best Local Similarity 95.3%; Pred. No. 1.5e-108; 

Matches 506; Conservative 0; Mismatches 25; Indels 0; Gaps 0; 

Qy    388 AAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAG 447
          | |||   || | |  ||  |     |   ||    ||||||||||||||||||||||||
Db    121 AGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAAAGCCACCCGGCCCAAGTTGAAGAAG 180

Qy    448 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 507
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 240

Qy    508 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 567
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 300

Qy    568 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 627
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 360

Qy    628 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 687
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 420

Qy    688 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 747
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 480

Qy    748 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 807
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 540

Qy    808 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 867
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 600

Qy    868 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAGTGTC 918
          ||||||||||||||||||||||||||||||||||||||||||||||  | |
Db    601 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAAAGCC 651

RESULT 12 
ABS56045 

ID ABS56045 standard; cDNA; 2266 BP. 
XX 

AC ABS56045; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human third splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 

KW vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT CDS 68..2010 

FT /*tag= a 

FT /product= "Third splice variant of Don-1" 

FT /transl_except= (pos:107..108, aa:Lys) 

FT /note= "This codon has an apparent 1 nucleotide deletion 

FT which alters the reading frame" 

FT /transl_except= (pos:994..996, aa:Thr) 

XX 



PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-00096241. 
XX 

PR 22-JUN-2000; 2000US-00599789. 
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71644. 



PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 4; Fig 7; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence encodes human third splice 

CC variant of Don-1 

XX 

SQ Sequence 2266 BP; 502 A; 733 C; 700 G; 331 T; 0 U; 0 Other; 



Query Match 49.4%; Score 491; DB 7; Length 2266; 

Best Local Similarity 95.3%; Pred. No. 1.6e-108; 

Matches 506; Conservative 0; Mismatches 25; Indels 0; Gaps 0; 

Qy    388 AAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAG 447
          | |||   || | |  ||  |     |   ||    ||||||||||||||||||||||||
Db    121 AGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAAAGCCACCCGGCCCAAGTTGAAGAAG 180

Qy    448 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 507
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 240

Qy    508 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 567
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 300

Qy    568 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 627
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 360

Qy    628 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 687
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 420

Qy    688 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 747
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 480

Qy    748 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 807
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 540

Qy    808 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 867
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 600

Qy    868 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAGTGTC 918
          ||||||||||||||||||||||||||||||||||||||||||||||  | |
Db    601 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAAAGCC 651



RESULT 13 
AAV17815 

ID AAV17815 standard; cDNA; 1476 BP. 
XX 
AC AAV17815; 
XX 
DT 17-AUG-1998 (first entry) 
XX 
DE Homo sapiens don-1 gene splice variant. 
XX 
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 
KW wound healing; transmembrane; ss. 
XX 
OS Homo sapiens. 
XX 
FH Key Location/Qualifiers 
FT CDS 69..1475 
FT /*tag= a 
FT /note= "don-1 polypeptide" 
XX 
PN WO9807736-A1. 
XX 
PD 26-FEB-1998. 
XX 
PF 18-AUG-1997; 97WO-US014585. 
XX 
PR 19-AUG-1996; 96US-00699591. 
PR 19-NOV-1996; 96US-00753007. 
XX 
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 
PI Gearing DP, Busfield SJ; 
XX 
DR WPI; 1998-169084/15. 
DR P-PSDB; AAW48382. 
XX 
PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas 
PT and adenocarcinoma(s), and for wound healing. 
XX 
PS Claim 4; Fig 4; 121pp; English. 
XX 
CC The sequence is that of a human don-1 gene splice variant. Don-1 
CC polypeptides stimulate proliferation of epithelial cells and thus are 
CC implicated in melanomas and adenocarcinomas in which epithelial cells 



CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 

XX 

SQ Sequence 1476 BP; 335 A; 475 C; 450 G; 216 T; 0 U; 0 Other; 

Query Match 49.3%; Score 490.4; DB 2; Length 1476; 

Best Local Similarity 92.6%; Pred. No. 2e-108; 

Matches 515; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 

Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db     98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy    843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
          |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db    578 AAATGGATTCTTCGCACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy    903 TCCTAAGCAAAGTGTC 918
          |||||||||||  | |
Db    638 TCCTAAGCAAAAAGCC 653

RESULT 14 
AAV17812 

ID AAV17812 standard; cDNA; 2467 BP.
XX
AC AAV17812;
XX
DT 17-AUG-1998 (first entry)
XX
DE Mus musculus don-1 gene splice variant.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell
KW proliferation; [illegible]
KW breast; liver; [illegible]
KW wound healing;
XX
OS Mus musculus.
XX
FH Key Location/Qualifiers
FT CDS 79..1896
FT /*tag= a
FT /note= "transmembrane don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US014585.
XX
PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Gearing DP, Busfield SJ;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48379.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas
PT and adenocarcinoma(s), and for wound healing.
XX
PS Claim 4; Fig 1; 121pp; English.
XX
CC The sequence is that of a murine don-1 gene splice variant. Don-1
CC polypeptides stimulate proliferation of epithelial cells and thus are
CC implicated in melanomas and adenocarcinomas in which epithelial cells
CC proliferate out of control. Compounds that interfere with don-1 mediated
CC cell proliferation can be used in the treatment of tumours such as
CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast,
CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus.
CC Alternatively, don-1 polypeptides can be used to stimulate epithelial
CC cell proliferation, e.g. for wound healing
XX
SQ Sequence 2467 BP; 592 A; 752 C; 706 G; 417 T; 0 U; 0 Other;



Query Match 46.7%; Score 464.6; DB 2; Length 2467; 

Best Local Similarity 91.0%; Pred. No. 4e-102; 

Matches 494; Conservative 0; Mismatches 49; Indels 0; Gaps 0; 
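
The two percentages in the summary lines above can be reproduced from the counts. The sketch below assumes that Query Match is the score divided by the perfect score and that Best Local Similarity is matches divided by aligned columns. These definitions are inferred from the figures printed in this report, not taken from GenCore documentation, and the function name is hypothetical.

```python
def summary_percentages(score, perfect_score, matches, mismatches, indels):
    """Recompute 'Query Match' and 'Best Local Similarity' as percentages.

    Assumed definitions (inferred from the figures in this report):
      Query Match           = score / perfect score
      Best Local Similarity = matches / aligned columns
    """
    columns = matches + mismatches + indels  # every aligned column, gaps included
    query_match = 100.0 * score / perfect_score
    similarity = 100.0 * matches / columns
    return round(query_match, 1), round(similarity, 1)


# Figures printed for this hit: Score 464.6 against a perfect score of 994,
# with 494 matches, 49 mismatches and no indels.
print(summary_percentages(464.6, 994, 494, 49, 0))
```

Under these assumptions the call returns the 46.7% and 91.0% shown for this hit.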



Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| ||||  |||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGCGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | ||||||||||||| |||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGACCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy  911 AAA 913
        |||
Db  542 AAA 544



RESULT 15 


ABS56033 


ID ABS56033 standard; cDNA; 2442 BP.
XX
AC ABS56033;
XX
DT 14-JAN-2003 (first entry)
XX
DE cDNA encoding murine membrane-bound splice variant of Don-1.
XX
KW Murine; Don-1; epidermal growth factor; EGF; neuregulin; mouse;
KW glycoprotein ligand; cell proliferation; cell proliferative disorders-
KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation;
KW cell survival; epithelial cell; wound healing; tumour formation; brain;
KW vulnerary; cytostatic; gene therapy; chromosome 18; gene; ss.
XX
OS Mus sp.
XX
FH Key Location/Qualifiers
FT CDS 78..1895
FT /*tag= a
FT /product= "Membrane-bound splice variant of Don-1"
XX 

PN US2002127594-A1.
XX
PD 12-SEP-2002.
XX
PF 12-MAR-2002; 2002US-00096241.
XX
PR 22-JUN-2000; 2000US-00599789.
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71636. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 4; Fig 1; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence encodes murine membrane- 

CC bound splice variant of Don-1 

XX 

SQ Sequence 2442 BP; 587 A; 742 C; 703 G; 410 T; 0 U; 0 Other; 
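
Records like the one above use a two-letter line code (ID, AC, DT, KW, CC and so on) with XX as a spacer line. As a rough sketch of how such a record could be loaded for inspection, the helper below groups the text of each line by its code. The function name is hypothetical and the handling is deliberately simple: continuation lines are joined with a single space, and no attempt is made to parse the FT feature table.

```python
def parse_tagged_record(lines):
    """Group the text of an EMBL-style record by its two-letter line code."""
    fields = {}
    for raw in lines:
        line = raw.strip()
        if len(line) < 2 or line[:2] == "XX":
            continue  # skip blank lines and XX spacer lines
        code, text = line[:2], line[2:].strip()
        fields.setdefault(code, []).append(text)
    # Join repeated codes (for example several KW or CC lines) into one string.
    return {code: " ".join(parts) for code, parts in fields.items()}


record = [
    "ID ABS56033 standard; cDNA; 2442 BP.",
    "XX",
    "AC ABS56033;",
    "XX",
    "PA (GEAR/) GEARING D P.",
    "PA (BUSF/) BUSFIELD S J.",
]
print(parse_tagged_record(record)["PA"])
```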

Query Match 45.9%; Score 455.8; DB 7; Length 2442; 
Best Local Similarity 91.2%; Pred. No. 5.3e-100; 

Matches 495; Conservative 0; Mismatches 47; Indels 1; Gaps 1; 
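
The counts in the line above can be recomputed from a gapped alignment, and the reported score can be approximated from those counts. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the scoring (match +1, mismatch -0.6, each gap costing Gapop plus Gapext per gapped residue) is inferred from the numbers printed in this report rather than taken from the IDENTITY_NUC table itself.

```python
def alignment_counts(qy, db, gap="-"):
    """Return (matches, mismatches, indels, gaps) for two aligned strings."""
    if len(qy) != len(db):
        raise ValueError("aligned strings must have equal length")
    matches = mismatches = indels = gaps = 0
    in_gap = False
    for a, b in zip(qy.upper(), db.upper()):
        if a == gap or b == gap:
            indels += 1          # one gapped column
            if not in_gap:
                gaps += 1        # a new run of gapped columns starts here
            in_gap = True
        else:
            in_gap = False
            if a == b:
                matches += 1
            else:
                mismatches += 1
    return matches, mismatches, indels, gaps


def inferred_score(matches, mismatches, indels, gaps,
                   mismatch_penalty=0.6, gapop=10.0, gapext=1.0):
    """Score under the ASSUMED scheme described in the lead-in."""
    return matches - mismatch_penalty * mismatches - gapop * gaps - gapext * indels


# First 60 columns of this hit (Qy 371-430 against Db 2-60).
qy = "CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC"
db = "CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCA-CC"
print(alignment_counts(qy, db))
print(round(inferred_score(495, 47, 1, 1), 1))
```

With the whole-hit counts (495 matches, 47 mismatches, 1 indel in 1 gap) the assumed scheme gives 455.8, which agrees with the Score printed for this result.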

Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||| ||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCA-CC 60

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   61 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 120

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  121 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 180

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  181 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 240

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  241 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 300

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  301 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 360

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  361 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 420

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  421 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 480

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 540

Qy  911 AAA 913
        |||
Db  541 AAA 543



Search completed: August 15, 2004, 05:47:22 
Job time : 481.763 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: August 15, 2004, 05:20:54 ; Search time 90.9371 Seconds 

(without alignments) 
6065.966 Million cell updates/sec 
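
The throughput figure above appears to follow from the query length, the database size and the search time reported in this header. The sketch below assumes that one cell update is one query residue compared with one database residue, and that both strands are searched (the hit count reported in this header is exactly twice the number of sequences). Both points are inferences from the printed numbers, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second, in millions.

    strands=2 is an assumption (query compared against both strands).
    """
    cells = strands * query_len * db_residues
    return cells / seconds / 1e6


# Values from this header: 994-residue query, 277475446 database residues,
# search time 90.9371 seconds.
rate = million_cell_updates_per_sec(994, 277475446, 90.9371)
print(f"{rate:.1f} million cell updates/sec")
```

The result lands within rounding distance of the 6065.966 printed above, which is what suggests the two assumptions.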

Title: US-09-864-675-1 
Perfect score: 994 

Sequence: 1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 682709 seqs, 277475446 residues 

Total number of hits satisfying chosen parameters: 1365418 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
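
The parameters above describe a Smith-Waterman ("sw") local alignment with an identity scoring table and affine gap costs (Gapop 10.0, Gapext 1.0). A minimal score-only sketch of that technique follows. It is not GenCore's implementation. The match value (+1) and mismatch value (-0.6) are assumptions chosen because they reproduce the scores printed in this report, and a gap of length k is assumed to cost Gapop + k × Gapext. The function name is hypothetical.

```python
def smith_waterman_score(query, target, match=1.0, mismatch=-0.6,
                         gapop=10.0, gapext=1.0):
    """Best local alignment score with affine gaps (Gotoh recurrences)."""
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # H for the previous query row
    f = [neg_inf] * (n + 1)       # best score ending in a gap in the target
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        e = neg_inf               # best score ending in a gap in the query
        for j in range(1, n + 1):
            # Extend an existing gap or open a new one (open + first extension).
            e = max(e - gapext, h_cur[j - 1] - gapop - gapext)
            f[j] = max(f[j] - gapext, h_prev[j] - gapop - gapext)
            step = match if query[i - 1] == target[j - 1] else mismatch
            h_cur[j] = max(0.0, h_prev[j - 1] + step, e, f[j])
            best = max(best, h_cur[j])
        h_prev = h_cur
    return best


left = "ACGTACGTACGTACGTACGT"
right = "GGCCTTAAGGCCTTAAGGCC"
# The target carries one extra residue between the two halves of the query.
print(smith_waterman_score(left + right, left + "T" + right))
```

The example needs a single one-residue gap to join two 20-residue exact matches, so under the assumed costs the score is 40 minus 11. Only two rows of the matrix are kept in memory, which is enough when the alignment itself is not needed.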

Database : Issued_Patents_NA:*

1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*
2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*
3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*
4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*
5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*
6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

RESULT 1

US-08-753-007A-5
; Sequence 5, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1884 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 664...1883
; OTHER INFORMATION:
US-08-753-007A-5

Query Match 90.5%; Score 900; DB 3; Length 1884;
Best Local Similarity 99.3%; Pred. No. 4.7e-225;
Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1;



Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy  901 GATCCTAAGCAAAGTGTCCT 920
        |||||||||||||    |||
Db 1117 GATCCTAAGCAAAAGCACCT 1136



RESULT 2 
US-09-398-496-5 

; Sequence 5, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1884 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 664...1883
; OTHER INFORMATION:
US-09-398-496-5

Query Match 90.5%; Score 900; DB 3; Length 1884; 

Best Local Similarity 99.3%; Pred. No. 4.7e-225; 

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 

Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy  901 GATCCTAAGCAAAGTGTCCT 920
        |||||||||||||    |||
Db 1117 GATCCTAAGCAAAAGCACCT 1136



RESULT 3 

US-08-525-864A-3 

; Sequence 3, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: AscII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 3:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 993 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 1..990
US-08-525-864A-3

Query Match 88.6%; Score 881; DB 2; Length 993; 

Best Local Similarity 93.0%; Pred. No. 3.3e-220; 

Matches 923; Conservative 0; Mismatches 70; Indels 0; Gaps 0; 



Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db  121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db  181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240




RESULT 4 

US-08-525-864A-1 

; Sequence 1, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: AscII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 3441 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 180..2441
US-08-525-864A-1

Query Match 74.3%; Score 738.6; DB 2; Length 3441; 

Best Local Similarity 90.4%; Pred. No. 6.9e-183; 

Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0; 



Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db  180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db  300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db  360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db  420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db  480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db  540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db  600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db  660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db  720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db  780 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 839

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db  840 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 899

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db  900 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 959

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
        || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db  960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019

Qy  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
        ||    |||| |  |||  | || |||  | ||
Db 1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 1052





RESULT 5 

US-08-753-007A-3 

; Sequence 3, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 3:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1607 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 79...621
; OTHER INFORMATION:
US-08-753-007A-3

Query Match 55.1%; Score 547.2; DB 3; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 3.9e-133; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 

Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy  911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy  971 CAAGCACCTTGGATTTGAATTAAA 994
        ||||||||||||||||||||| ||
Db  602 CAAGCACCTTGGATTTGAATTGAA 625





RESULT 6 
US-09-398-496-3 

; Sequence 3, Application US/09398496 
; Patent No. 6133423 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J. 

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
; CITY: Boston 

; STATE: MA 

COUNTRY: US 
; ZIP: 02110-2804 

COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0 

; CURRENT APPLICATION DATA: 



APPLICATION NUMBER: US/09/398,496
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: 

INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1607 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: Coding Sequence
LOCATION: 79...621
OTHER INFORMATION: 
US-09-398-496-3 

Query Match 55.1%; Score 547.2; DB 3; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 3.9e-133; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 
Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy  911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy  971 CAAGCACCTTGGATTTGAATTAAA 994
        ||||||||||||||||||||| ||
Db  602 CAAGCACCTTGGATTTGAATTGAA 625





RESULT 7 

US-08-753-007A-7 

; Sequence 7, Application US/08753007A

; Patent No. 6074841 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

; APPLICANT: Busfield, Samantha J.

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

; CITY: Boston 

; STATE: MA 

COUNTRY: US 
ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/753,007A

; FILING DATE: 19-NOV-1996 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: 617-542-5070 

; TELEFAX: 617-542-8906 



TELEX: 

INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1476 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: Coding Sequence
LOCATION: 69...1475
OTHER INFORMATION: 
US-08-753-007A-7 

Query Match 49.5%; Score 492; DB 3; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 9.1e-119; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy  363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
        || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db   98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy  423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy  483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy  543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy  603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy  663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy  723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy  783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy  843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy  903 TCCTAAGCAAAGTGTC 918
        |||||||||||  | |
Db  638 TCCTAAGCAAAAAGCC 653



RESULT 8 
US-09-398-496-7 

; Sequence 7, Application US/09398496 

; Patent No. 6133423 

; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C.

; STREET: 225 Franklin Street 

; CITY: Boston 

STATE: MA 
; COUNTRY: US 

ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/09/398,496

FILING DATE: 
; CLASSIFICATION: 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
; FILING DATE: 19-NOV-1996 

; APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
; TELEX: 

; INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1476 base pairs 
; TYPE: nucleic acid 

; STRANDEDNESS: single

TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 
; FEATURE: 

; NAME/KEY: Coding Sequence 

; LOCATION: 69...1475

; OTHER INFORMATION: 

US-09-398-496-7 

Query Match 49.5%; Score 492; DB 3; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 9.1e-119; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;



Qy  363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
        || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db   98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy  423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy  483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy  543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy  603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy  663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy  723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy  783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy  843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy  903 TCCTAAGCAAAGTGTC 918
        |||||||||||  | |
Db  638 TCCTAAGCAAAAAGCC 653





RESULT 9 

US-08-753-007A-31 

; Sequence 31, Application US/08753007A
; Patent No. 6074841 

GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J.

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

; TITLE OF INVENTION: AND USES THEREFOR 

; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

; CITY: Boston 

STATE: MA 



COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A
FILING DATE: 19-NOV-1996 
CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION:
TELEPHONE: 617-542-5070
TELEFAX: 617-542-8906 
TELEX: 

INFORMATION FOR SEQ ID NO: 31: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2268 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: Coding Sequence
LOCATION: 69...2009
OTHER INFORMATION: 
US-08-753-007A-31 

Query Match 49.5%; Score 492; DB 3; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 1.1e-118;

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy  363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
        || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db   98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy  423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy  483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy  543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy  603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy  663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy  723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy  783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy  843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy  903 TCCTAAGCAAAGTGTC 918
        |||||||||||  | |
Db  638 TCCTAAGCAAAAAGCC 653





RESULT 10 
US-09-398-496-31 

; Sequence 31, Application US/09398496 
; Patent No. 6133423 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
; CITY: Boston 

; STATE: MA 

; COUNTRY: US 

; ZIP: 02110-2804 

COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

; SOFTWARE: FastSEQ Version 2.0 

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
; FILING DATE: 

; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
; APPLICATION NUMBER: 08/699,591 

; FILING DATE: 19-AUG-1996 

ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 



REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 
; TELEX: 

INFORMATION FOR SEQ ID NO: 31: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 2268 base pairs 
; TYPE: nucleic acid 

; STRANDEDNESS: single 

; TOPOLOGY: linear 

; MOLECULE TYPE: cDNA 

; FEATURE:

; NAME/KEY: Coding Sequence

LOCATION: 69...2009

OTHER INFORMATION: 
US-09-398-496-31 

Query Match 49.5%; Score 492; DB 3; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 1.1e-118;

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy  363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
        || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db   98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy  423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy  483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy  543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy  603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy  663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy  723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy  783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy  843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy  903 TCCTAAGCAAAGTGTC 918
        |||||||||||  | |
Db  638 TCCTAAGCAAAAAGCC 653



RESULT 11 
US-08-753-007A-1 

; Sequence 1, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

; APPLICANT: Busfield, Samantha J.

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

; TITLE OF INVENTION: AND USES THEREFOR 

; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
; CITY: Boston 

; STATE: MA 

; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

; OPERATING SYSTEM: DOS 

; SOFTWARE: FastSEQ Version 2.0 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996 

CLASSIFICATION: 536 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
; TELEFAX: 617-542-8906 

TELEX: 

; INFORMATION FOR SEQ ID NO: 1: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 2467 base pairs 

; TYPE: nucleic acid 

; STRANDEDNESS: single 

; TOPOLOGY: circular 

; MOLECULE TYPE: cDNA 
; FEATURE:

; NAME/KEY: Coding Sequence

LOCATION: 79...1893

OTHER INFORMATION: 
US-08-753-007A-1 



Query Match 47.1%; Score 467.8; DB 3; Length 2467;

Best Local Similarity 91.3%; Pred. No. 2.3e-112;

Matches 496; Conservative 0; Mismatches 47; Indels 0; Gaps 0;



Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy  911 AAA 913
        |||
Db  542 AAA 544





RESULT 12 
US-09-398-496-1 

; Sequence 1, Application US/09398496 
; Patent No. 6133423 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 



CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION:
TELEPHONE: 617-542-5070
TELEFAX: 617-542-8906
TELEX:

INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2467 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: circular
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: Coding Sequence
LOCATION: 79...1893
OTHER INFORMATION: 
US-09-398-496-1 



Query Match 47.1%; Score 467.8; DB 3; Length 2467;

Best Local Similarity 91.3%; Pred. No. 2.3e-112;

Matches 496; Conservative 0; Mismatches 47; Indels 0; Gaps 0;

Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy  911 AAA 913
        |||
Db  542 AAA 544



RESULT 13 
US-08-525-864A-5 

; Sequence 5, Application US/08525864A 

; Patent No. 5912326 

; GENERAL INFORMATION: 

APPLICANT: Chang, Han 

TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses 
; TITLE OF INVENTION: Related thereto 
; NUMBER OF SEQUENCES: 18 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: LAHIVE & COCKFIELD 

; STREET: 28 State Street 

; CITY: Boston 

; STATE: Massachusetts 

COUNTRY: USA 

ZIP: 02109 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

; OPERATING SYSTEM: PC-DOS/MS-DOS 

; SOFTWARE: ASCII (text)

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/525,864A

FILING DATE: 8-SEP-1995 

CLASSIFICATION: 530 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Kara, Catherine J. 

REGISTRATION NUMBER: 41,106 
; REFERENCE/DOCKET NUMBER: HUI-017 

TELECOMMUNICATION INFORMATION: 



TELEPHONE: (617)227-7400 
; TELEFAX: (617)742-4214 

; INFORMATION FOR SEQ ID NO: 5: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 1207 base pairs 

; TYPE: nucleic acid 

; STRANDEDNESS: double 

; TOPOLOGY: linear 

; MOLECULE TYPE: cDNA 

FEATURE: 
; NAME/ KEY: CDS 

LOCATION: 2. .394 
US-08-525-864A-5 



Query Match 36.2%; Score 359.6; DB 2; Length 1207; 

Best Local Similarity 94.0%; Pred. No. 2.7e-84; 

Matches 374; Conservative 0; Mismatches 24; Indels 0; 



Qy  597 AAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTG 656
        |||||||||||| |||||||||||||| |||||||||||||||||||| ||||| |||||
Db    1 AAAGAACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTG 60

Qy  657 CGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGT 716
         ||||| ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||
Db   61 TGAGGCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGT 120

Qy  717 GAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTC 776
        ||||||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||
Db  121 GAGCACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTC 180

Qy  777 CTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 836
        ||| || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||
Db  181 CTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAA 240

Qy  837 ATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACAT 896
        ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACAT 300

Qy  897 GCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCA 956
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCA 360

Qy  957 ATGGTCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
        ||||||||||||||||||||||||||||||||||||||
Db  361 ATGGTCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 398



RESULT 14
US-08-525-864A-18
; Sequence 18, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: AscII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 18:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 142 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-08-525-864A-18

Query Match 13.6%; Score 135.6; DB 2; Length 142;
Best Local Similarity 97.2%; Pred. No. 2.8e-26;

Matches 138; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy  792 AGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATT 851
        |||||| |||||||||||||| ||||||||||| |||||||||||||||||||| |||||
Db    1 AGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGTCCAAACGGATT 60

Qy  852 CTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCA 911
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 CTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCA 120

Qy  912 AAGTGTCCTGTGGGATACACCG 933
        ||||||||||||||||||||||
Db  121 AAGTGTCCTGTGGGATACACCG 142



RESULT 15
US-08-036-555B-149
; Sequence 149, Application US/08036555B
; Patent No. 5530109
; GENERAL INFORMATION:
; APPLICANT: Goodearl, Andrew; Stroobant, Paul;
; APPLICANT: Minghetti, Luisa; Waterfield, Michael; Marchioni, Mark;
; APPLICANT: Chen, Maio Su; Hiles, Ian
; TITLE OF INVENTION: Glial Mitogenic Factors, Their
; TITLE OF INVENTION: Preparation and Use
; NUMBER OF SEQUENCES: 184
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Felfe & Lynch
; STREET: 805 Third Avenue
; CITY: New York City
; STATE: New York
; COUNTRY: USA
; ZIP: 10022
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette, 5.25 inch, 360 kb storage
; COMPUTER: IBM
; OPERATING SYSTEM: PC-DOS
; SOFTWARE: Wordperfect
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/036,555B
; FILING DATE: 24-MAR-1993
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/965,173
; FILING DATE: 23-OCT-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/940,389
; FILING DATE: 03-SEP-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/907,138
; FILING DATE: 30-JUN-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/863,703
; FILING DATE: 03-APRIL-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: U.K. 91 07566.3
; FILING DATE: 10-APRIL-1991
; ATTORNEY/AGENT INFORMATION:
; NAME: Tsai, Christine H.
; REGISTRATION NUMBER: 34,266
; REFERENCE/DOCKET NUMBER: LUD 5250.4
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (212) 688-9200
; TELEFAX: (212) 838-3884
; INFORMATION FOR SEQ ID NO: 149:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1140
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
US-08-036-555B-149

Query Match 9.6%; Score 95.4; DB 1; Length 1140; 

Best Local Similarity 50.7%; Pred. No. 1.9e-15; 

Matches 409; Conservative 0; Mismatches 361; Indels 37; Gaps 6;

Qy  194 GGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCAGCGCG 253
        ||| ||||   |    ||   || | |||    |||| |  | || | | |||     |
Db   11 GGGCGGCGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGGCGCCT 70

Qy  254 AGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTT 313
         |        |  | |   |||||| | ||  ||| |   | ||    | |||||||| |
Db   71 GGGGCCACCCCGCCTTCCCCTCCTGCGGGCGCCTCAAGGAGGACAGCAGGTACATCTTCT 130

Qy  314 TCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACC- 372
        || ||||||||  ||   |   |   | |           | || |   ||||  | ||
Db  131 TCATGGAGCCCGAGGCCAACAGCAGCGGCGGGCCCGGCCGCCTTCCGAGCCTCCTTCCCC 190

Qy  373 ---------AACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGC 423
                 |  |||   || |||||||| |||||  ||  | | |||||||  |  |||
Db  191 CCTCTCGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAACGG-TGC 249

Qy  424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
        |||   |  |||   |||||  |||||||||| |||  |     ||  |||   || | |
Db  250 GCCTTGCCTCCCCGCTTGAAAGAGATGAAGAGTCAGGAGTCTGTGGCAGGTTCCAAACTA 309

Qy  484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
          |||   ||| ||| |     | |   |  ||   |     | |   ||||||||| ||
Db  310 GTGCTTCGGTGCGAGACCAGTTCTGAATACTCCTCTCTCAAGTTCAAGTGGTTCAAGAAT 369

Qy  544 GGCAAGGAGCTCAACCG---CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        || |  ||  | | |||    | |  |        ||||||       ||  |      |
Db  370 GGGAGTGAATTAAGCCGAAAGAACAAACCAGAAAACATCAAGATACAGAAAAGGCCGGGG 429

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        || |||  ||| |   | | ||| | |    |||  ||  |||| || ||| | ||| |
Db  430 AAGTCAGAACTTCGCATTAGCAAAGCGTCACTGGCTGATTCTGGAGAATATATGTGCAAA 489

Qy  661 GCCGAGAACATCCTGGGGAAGGACA------CCGTCCGGGGCCGGCTTTACGTCAACAGC 714
        |     | ||  || || || ||||      | | |     |    ||   |     | |
Db  490 GTGATCAGCAAACTAGGAAATGACAGTGCCTCTGCCAACATCACCATTGTGGAGTCAAAC 549

Qy  715 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 774
        |  |   ||||     || |  || |  | ||       |||||    ||||  |  ||
Db  550 GCCACATCCACATCTACAGCTGGGACAAGCCATCTTGTCAAGTGTGCAGAGAAGGAGAAA 609

Qy  775 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTC---- 830
         | |  || || ||||||||||  |||| |    |  | | | |     | | |||
Db  610 ACTTTCTGTGTGAATGGAGGCGAGTGCTTCATGGTGAAAGACCTTTCAAATCCCTCAAGA 669

Qy  831 -----CTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTG 885
              ||||| || | |  |||||||   |||  |||||||   |||||  ||||  ||
Db  670 TACTTGTGCAAGTGCCAACCTGGATTCACTGGAGCGAGATGTACTGAGAATGTGCCCATG 729

Qy  886 CGATTGTACATGCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTC 945
          | |  | |        || ||| |||||| ||    |  |  || || ||  | || |
Db  730 AAAGTCCAAA--------CCCAAGAAAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCC 781

Qy  946 AGCAGTTCGCAATGGTCAACTTCTCCA 972
        |  | | || ||||| || ||||| ||
Db  782 AAAACTACGTAATGGCCAGCTTCTACA 808



Search completed: August 15, 2004, 09:45:30 
Job time : 94.9371 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: August 15, 2004, 08:02:19 ; Search time 536.161 Seconds
(without alignments)
9096.466 Million cell updates/sec

Title: US-09-864-675-1
Perfect score: 994
Sequence: 1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 3225727 seqs, 2453303834 residues

Total number of hits satisfying chosen parameters: 6451454

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : Published_Applications_NA:*
1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
13: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq2:*
14: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
15: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
16: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
17: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
18: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
19: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
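
The per-result summary figures in this report can be cross-checked against the search parameters in the header. The following is a minimal sketch, not the GenCore implementation. It assumes, as inferences from the reported numbers rather than anything the report states, that IDENTITY_NUC scores +1 per match and -0.6 per mismatch, that each gap costs Gapop once plus Gapext per gapped position, that Query Match is Score divided by Perfect score, that Best Local Similarity is Matches divided by aligned columns, and that the cell-update rate counts both strands. All function names are hypothetical.

```python
"""Cross-check summary figures of a search report (illustrative sketch only)."""

PERFECT_SCORE = 994.0     # "Perfect score" in the report header
GAP_OPEN = 10.0           # "Gapop" in the report header
GAP_EXTEND = 1.0          # "Gapext" in the report header
MISMATCH_PENALTY = 0.6    # assumption: inferred from reported scores, not stated


def alignment_score(matches, mismatches, indels, gaps):
    """Score under the assumed model: +1 per match, minus the penalties."""
    return (matches
            - MISMATCH_PENALTY * mismatches
            - GAP_OPEN * gaps
            - GAP_EXTEND * indels)


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second, in millions.

    strands=2 is an assumption: the hit count in the header is twice the
    number of sequences searched, which suggests both strands are scanned.
    """
    return query_len * db_residues * strands / seconds / 1e6


if __name__ == "__main__":
    # Figures reported for US-10-096-241-5 in this search.
    score = alignment_score(matches=914, mismatches=5, indels=1, gaps=1)
    print("score:", round(score, 1))
    print("query match %:", round(query_match_pct(score), 1))
    print("best local similarity %:",
          round(best_local_similarity_pct(914, 5, 1), 1))
    # Run statistics from the header of this search.
    print("Mcell updates/sec:",
          round(million_cell_updates_per_sec(994, 2453303834, 536.161), 1))
```

Under these assumptions the computed values agree with the printed ones to the precision shown in the report, which is the only support offered here for the assumed 0.6 mismatch penalty.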
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ALIGNMENTS 



RESULT 1
US-09-864-675-1
; Sequence 1, Application US/09864675
; Patent No. US20020081286A1
; GENERAL INFORMATION:
; APPLICANT: Marchionni, Mark
; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES,
; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS
; FILE REFERENCE: 04585/049002
; CURRENT APPLICATION NUMBER: US/09/864,675
; CURRENT FILING DATE: 2001-05-23
; PRIOR APPLICATION NUMBER: US 60/206,495
; PRIOR FILING DATE: 2000-05-23
; NUMBER OF SEQ ID NOS: 18
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 1
; LENGTH: 994
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-864-675-1

Query Match 100.0%; Score 994; DB 9; Length 994;
Best Local Similarity 100.0%; Pred. No. 1e-280;
Matches 994; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

Qy  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy  901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy  961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
        ||||||||||||||||||||||||||||||||||
Db  961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994



RESULT 2
US-10-096-241-5
; Sequence 5, Application US/10096241
; Publication No. US20020127594A1
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/096,241
; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1884 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 664...1883
; OTHER INFORMATION:
; SEQUENCE DESCRIPTION: SEQ ID NO: 5:
US-10-096-241-5

Query Match 90.5%; Score 900; DB 14; Length 1884;
Best Local Similarity 99.3%; Pred. No. 4.4e-253;
Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1;

Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy  901 GATCCTAAGCAAAGTGTCCT 920
        |||||||||||||    |||
Db 1117 GATCCTAAGCAAAAGCACCT 1136




RESULT 3
US-09-864-675-3
; Sequence 3, Application US/09864675
; Patent No. US20020081286A1
; GENERAL INFORMATION:
; APPLICANT: Marchionni, Mark
; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES,
; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS
; FILE REFERENCE: 04585/049002
; CURRENT APPLICATION NUMBER: US/09/864,675
; CURRENT FILING DATE: 2001-05-23
; PRIOR APPLICATION NUMBER: US 60/206,495
; PRIOR FILING DATE: 2000-05-23
; NUMBER OF SEQ ID NOS: 18
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 3
; LENGTH: 897
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-864-675-3

Query Match 85.4%; Score 849; DB 9; Length 897;
Best Local Similarity 98.3%; Pred. No. 3e-238;
Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0;

Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840

Qy  841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
        ||    |||| |  |||  | || |||  | ||
Db  841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873



RESULT 4
US-10-096-241-3
; Sequence 3, Application US/10096241
; Publication No. US20020127594A1
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/096,241
; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 3:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1607 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 79...621
; OTHER INFORMATION:
; SEQUENCE DESCRIPTION: SEQ ID NO: 3:
US-10-096-241-3

Query Match 55.1%; Score 547.2; DB 14; Length 1607;
Best Local Similarity 92.3%; Pred. No. 9.6e-150;
Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0;

Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy  911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy  971 CAAGCACCTTGGATTTGAATTAAA 994
        ||||||||||||||||||||| ||
Db  602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 5
US-10-096-241-7
; Sequence 7, Application US/10096241
; Publication No. US20020127594A1
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/096,241
; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 7:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1476 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 69...1475
; OTHER INFORMATION:
; SEQUENCE DESCRIPTION: SEQ ID NO: 7:
US-10-096-241-7

Query Match 49.5%; Score 492; DB 14; Length 1476;
Best Local Similarity 92.8%; Pred. No. 1.4e-133;
Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;

Qy   363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
         || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db    98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy   423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy   483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy   543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy   603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy   663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy   723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy   783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy   843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy   903 TCCTAAGCAAAGTGTC 918
         |||||||||||  | |
Db   638 TCCTAAGCAAAAAGCC 653



RESULT 6 

US-10-096-241-31 

; Sequence 31, Application US/10096241
; Publication No. US20020127594A1
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/096,241
; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 31:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2268 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 69...2009
; OTHER INFORMATION:
; SEQUENCE DESCRIPTION: SEQ ID NO: 31:
US-10-096-241-31

Query Match 49.5%; Score 492; DB 14; Length 2268;
Best Local Similarity 92.8%; Pred. No. 1.6e-133;
Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;


Qy   363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
         || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db    98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy   423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy   483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy   543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy   603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy   663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy   723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy   783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy   843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy   903 TCCTAAGCAAAGTGTC 918
         |||||||||||  | |
Db   638 TCCTAAGCAAAAAGCC 653



RESULT 7 
US-10-096-241-1 

; Sequence 1, Application US/10096241
; Publication No. US20020127594A1
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/096,241
; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2467 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: circular
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 79...1893
; OTHER INFORMATION:
; SEQUENCE DESCRIPTION: SEQ ID NO: 1:
US-10-096-241-1

Query Match 47.1%; Score 467.8; DB 14; Length 2467;
Best Local Similarity 91.3%; Pred. No. 2e-126;
Matches 496; Conservative 0; Mismatches 47; Indels 0; Gaps 0;

Qy   371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
         | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db     2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy   431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
         |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db    62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy   491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
         ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db   122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy   551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
         | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db   182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy   611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
         ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db   242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy   671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
         |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db   302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy   731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
         ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db   362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy   791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
         ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db   422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy   851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy   911 AAA 913
         |||
Db   542 AAA 544



RESULT 8 



US-10-271-416-6 

; Sequence 6, Application US/10271416
; Publication No. US20040043021A1
; GENERAL INFORMATION:
; APPLICANT: Keith, Tim
; APPLICANT: Little, Randall D.
; APPLICANT: Van Eerdewegh, Paul
; APPLICANT: Dupuis, Josee
; APPLICANT: Del Mastro, Richard G.
; APPLICANT: Allen, Kristina
; TITLE OF INVENTION: NUCLEOTIDE AND AMINO ACID SEQUENCES
; TITLE OF INVENTION: RELATING TO RESPIRATORY DISEASES AND OBESITY
; FILE REFERENCE: 2976-4045
; CURRENT APPLICATION NUMBER: US/10/271,416
; CURRENT FILING DATE: 2002-10-11
; PRIOR APPLICATION NUMBER: 60/328,424
; PRIOR FILING DATE: 2001-10-11
; NUMBER OF SEQ ID NOS: 9
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 6
; LENGTH: 22693
; TYPE: DNA
; ORGANISM: Homo sapien
US-10-271-416-6

Query Match 42.7%; Score 424.8; DB 13; Length 22693;
Best Local Similarity 94.2%; Pred. No. 1.6e-113;
Matches 452; Conservative 0; Mismatches 27; Indels 1; Gaps 1;



Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 20809 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 20868

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 20869 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 20928

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 20929 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 20988

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 20989 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 21048

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 21049 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 21108

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 21109 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 21168

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 21169 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 21228

Qy   421 TGC-GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAA 479
         ||| |  |  || |||        |  |  |||  |   || |||  ||||||  |||||
Db 21229 TGCGGTGAGTCGCCCCCTCCCTTTGCTGGAGAAAGGGGGGAGGGGCGAGGTGGTGGAGAA 21288



RESULT 9 

US-10-447-839A-10 

; Sequence 10, Application US/10447839A
; Publication No. US20040018181A1
; GENERAL INFORMATION:
; APPLICANT: Kufe, Donald W.
; APPLICANT: Kharbanda, Surender
; APPLICANT: Weitman, Steven D.
; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED THEREFROM
; FILE REFERENCE: 1000.05.009
; CURRENT APPLICATION NUMBER: US/10/447,839A
; CURRENT FILING DATE: 2003-05-29
; PRIOR APPLICATION NUMBER: 10/293,391
; PRIOR FILING DATE: 2002-11-13
; PRIOR APPLICATION NUMBER: 09/951,938
; PRIOR FILING DATE: 2001-09-11
; PRIOR APPLICATION NUMBER: 60/231,841
; PRIOR FILING DATE: 2000-09-11
; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 10
; LENGTH: 1054
; TYPE: DNA
; ORGANISM: Homo Sapien
US-10-447-839A-10

Query Match 42.7%; Score 424; DB 16; Length 1054;
Best Local Similarity 100.0%; Pred. No. 1e-113;
Matches 424; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   589 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 648

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   649 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 708

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   709 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 768

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   769 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 828

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   829 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 888

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   889 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 948

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   949 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 1008

Qy   421 TGCG 424
         ||||
Db  1009 TGCG 1012



RESULT 10 

US-10-029-386-26613
; Sequence 26613, Application US/10029386
; Publication No. US20030194704A1
; GENERAL INFORMATION:
; APPLICANT: Penn, Sharron G.
; APPLICANT: Rank, David R.
; APPLICANT: Hanzel, David K.
; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
; FILE REFERENCE: AEOMICA-X-2
; CURRENT APPLICATION NUMBER: US/10/029,386
; CURRENT FILING DATE: 2001-12-20
; NUMBER OF SEQ ID NOS: 34288
; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
; SEQ ID NO 26613
; LENGTH: 201
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: MAP TO CHR5 . 3
; OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 0.55
; OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 0.49
; OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 0.72
; OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 0.66
; OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 3.00e-29
; OTHER INFORMATION: NT HIT: AF119152.1, EVALUE 1.00e-109
; OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 3.00e-93
US-10-029-386-26613

Query Match 17.4%; Score 173; DB 15; Length 201;
Best Local Similarity 100.0%; Pred. No. 2.1e-40;
Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    27 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 86

Qy   484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    87 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 146

Qy   544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596
         |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   147 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 199



RESULT 11 
US-10-447-839A-11 

; Sequence 11, Application US/10447839A
; Publication No. US20040018181A1
; GENERAL INFORMATION:
; APPLICANT: Kufe, Donald W.
; APPLICANT: Kharbanda, Surender
; APPLICANT: Weitman, Steven D.
; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED THEREFROM
; FILE REFERENCE: 1000.05.009
; CURRENT APPLICATION NUMBER: US/10/447,839A
; CURRENT FILING DATE: 2003-05-29
; PRIOR APPLICATION NUMBER: 10/293,391
; PRIOR FILING DATE: 2002-11-13
; PRIOR APPLICATION NUMBER: 09/951,938
; PRIOR FILING DATE: 2001-09-11
; PRIOR APPLICATION NUMBER: 60/231,841
; PRIOR FILING DATE: 2000-09-11
; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 11
; LENGTH: 419
; TYPE: DNA
; ORGANISM: Homo Sapien
US-10-447-839A-11

Query Match 17.4%; Score 173; DB 16; Length 419;
Best Local Similarity 100.0%; Pred. No. 2.7e-40;
Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    50 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 109

Qy   484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   110 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 169

Qy   544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596
         |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   170 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 222



RESULT 12 

US-10-029-386-12913 

; Sequence 12913, Application US/10029386
; Publication No. US20030194704A1
; GENERAL INFORMATION:
; APPLICANT: Penn, Sharron G.
; APPLICANT: Rank, David R.
; APPLICANT: Hanzel, David K.
; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
; FILE REFERENCE: AEOMICA-X-2
; CURRENT APPLICATION NUMBER: US/10/029,386
; CURRENT FILING DATE: 2001-12-20
; NUMBER OF SEQ ID NOS: 34288
; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
; SEQ ID NO 12913
; LENGTH: 573
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: MAP TO CHR5 . 3
; OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 0.55
; OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 0.49
; OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 0.72
; OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 0.66
; OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 2.00e-28
; OTHER INFORMATION: NT HIT: AF119152.1, EVALUE 0.00e+00
; OTHER INFORMATION: EST_HUMAN HIT: BG996653.1, EVALUE 1.00e-108
US-10-029-386-12913

Query Match 17.4%; Score 173; DB 15; Length 573;
Best Local Similarity 100.0%; Pred. No. 3e-40;
Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   377 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 436

Qy   484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   437 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 496

Qy   544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596
         |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   497 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 549



RESULT 13 
US-10-271-416-7 

; Sequence 7, Application US/10271416
; Publication No. US20040043021A1
; GENERAL INFORMATION:
; APPLICANT: Keith, Tim
; APPLICANT: Little, Randall D.
; APPLICANT: Van Eerdewegh, Paul
; APPLICANT: Dupuis, Josee
; APPLICANT: Del Mastro, Richard G.
; APPLICANT: Allen, Kristina
; TITLE OF INVENTION: NUCLEOTIDE AND AMINO ACID SEQUENCES
; TITLE OF INVENTION: RELATING TO RESPIRATORY DISEASES AND OBESITY
; FILE REFERENCE: 2976-4045
; CURRENT APPLICATION NUMBER: US/10/271,416
; CURRENT FILING DATE: 2002-10-11
; PRIOR APPLICATION NUMBER: 60/328,424
; PRIOR FILING DATE: 2001-10-11
; NUMBER OF SEQ ID NOS: 9
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 7
; LENGTH: 45450
; TYPE: DNA
; ORGANISM: Homo sapien
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)...(45450)
; OTHER INFORMATION: n = A,T,C or G
US-10-271-416-7

Query Match 17.4%; Score 173; DB 13; Length 45450;
Best Local Similarity 100.0%; Pred. No. 1.2e-39;
Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  7026 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 7085

Qy   484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  7086 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 7145

Qy   544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596
         |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  7146 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 7198



RESULT 14 
US-10-447-839A-12 

; Sequence 12, Application US/10447839A
; Publication No. US20040018181A1
; GENERAL INFORMATION:
; APPLICANT: Kufe, Donald W.
; APPLICANT: Kharbanda, Surender
; APPLICANT: Weitman, Steven D.
; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED THEREFROM
; FILE REFERENCE: 1000.05.009
; CURRENT APPLICATION NUMBER: US/10/447,839A
; CURRENT FILING DATE: 2003-05-29
; PRIOR APPLICATION NUMBER: 10/293,391
; PRIOR FILING DATE: 2002-11-13
; PRIOR APPLICATION NUMBER: 09/951,938
; PRIOR FILING DATE: 2001-09-11
; PRIOR APPLICATION NUMBER: 60/231,841
; PRIOR FILING DATE: 2000-09-11
; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 12
; LENGTH: 493
; TYPE: DNA
; ORGANISM: Homo Sapien
US-10-447-839A-12

Query Match 12.5%; Score 124.6; DB 16; Length 493;
Best Local Similarity 93.5%; Pred. No. 4.3e-26;
Matches 130; Conservative 0; Mismatches 9; Indels 0; Gaps 0;



Qy   594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   227 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 286

Qy   654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAG 713
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   287 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAG 346

Qy   714 CGTGAGCACCACCCTGTCA 732
         ||  ||     ||| | ||
Db   347 CGGTAGGTGGGCCCAGACA 365



RESULT 15 
US-10-447-839A-13 

; Sequence 13, Application US/10447839A
; Publication No. US20040018181A1
; GENERAL INFORMATION:
; APPLICANT: Kufe, Donald W.
; APPLICANT: Kharbanda, Surender
; APPLICANT: Weitman, Steven D.
; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED THEREFROM
; FILE REFERENCE: 1000.05.009
; CURRENT APPLICATION NUMBER: US/10/447,839A
; CURRENT FILING DATE: 2003-05-29
; PRIOR APPLICATION NUMBER: 10/293,391
; PRIOR FILING DATE: 2002-11-13
; PRIOR APPLICATION NUMBER: 09/951,938
; PRIOR FILING DATE: 2001-09-11
; PRIOR APPLICATION NUMBER: 60/231,841
; PRIOR FILING DATE: 2000-09-11
; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 13
; LENGTH: 350
; TYPE: DNA
; ORGANISM: Homo Sapien
US-10-447-839A-13

Query Match 12.3%; Score 122.4; DB 16; Length 350;
Best Local Similarity 99.2%; Pred. No. 1.7e-25;
Matches 123; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   715 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 774
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    99 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 158

Qy   775 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGC 834
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   159 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGC 218

Qy   835 AAAT 838
         || |
Db   219 AAGT 222



Search completed: August 15, 2004, 12:23:31 
Job time : 545.161 secs



